-
Notifications
You must be signed in to change notification settings - Fork 65
[llm unify 1/n] Add consolidated prompt classes #1120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
…kwargs Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks fine, I'm curious about how the existing transforms/uses are going to map to this. I worry a bit about usability, do you think you can transition one of the existing docset transforms to this stuff in this PR to demonstrate?
""" | ||
format_args = self.kwargs | ||
format_args["doc_text"] = doc.text_representation | ||
format_args.update({"doc_property_" + k: v for k, v in doc.properties.items()}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
anything special we want to do for nested properties? lots of stuff is in properties.entity.* so single level flattening might not do much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also why are we flattening instead of accepting the dot notation? I think it's more intuitive to send a prompt like:
prompt = ElementListPrompt(
system = "Hello {name}. This is a prompt about {document.properties.path}"
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python string.format is rather limited, can't do any of
"hello {document.properties.name}".format(document=doc, document.properties=props, document.properties.name='henry')
- it doesn't do function calls or dict lookups
- can't have dotted param names.
which maybe means we shouldn't be using string.format here.
I did at one point play around with doing something like
s = "Hello {name}. this is a prompt about {document.properties.path}"
fstring = f"f\"{s}\""
formatted = eval(fstring, abunchofglobals)
but that seems dangerous bc you could maybe do something like
s = "{os.system('rm -rf /')}"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and I don't want to invent a formatting/templating system
def __init__( | ||
self, | ||
*, | ||
system: Optional[str] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're supporting system and user prompts (i.e. the messages api), shouldn't we be supporting a list then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I think we can limit to only one system prompt (I don't remember what the providers do but that seems sensible) though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that makes sense
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
idk if it makes sense to translate an op until llm_map exists but to sketch it out extract_entity would look like something like # default_prompts.py
EntityExtractorFewShotGuidancePrompt = ElementListPrompt(
system="You are a helpful entity extractor",
user="""You are given a few text elements of a document. The {entity} of the document is in these few text elements. Here are
some example groups of text elements where the {entity} has been identified.
{examples}
Using the context from the document and the provided examples, FIND, COPY, and RETURN the {entity}. Only return the {entity} as part
of your answer. DO NOT REPHRASE OR MAKE UP AN ANSWER.
{elements}
""",
) (instead of the class construction as in default_prompts) def extract_entity(self, entity_name, examples):
prompt = default_prompts.EntityExtractorFewShotGuidancePrompt.instead(
entity=entity_name, examples=examples
)
return self.llm_map(prompt, output_field=entity, llm_and_stuff_that_would_also_be_args_to_extract_entity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe rename instead
to something else like populate
or something
Signed-off-by: Henry Lindeman <hmlindeman@yahoo.com>
Creates a SycamorePrompt interface for rendering documents/elements as llm calls and adds a few implementations.
These are not the only ones to be implemented but these are necessary to move a number of the existing transforms to use llm_map (this prompt interface)