Reference Documentation
Prompt Optimizer
- class prompt_optimizer.poptim.AutocorrectOptim(fast: bool = False, verbose: bool = False, metrics: list = [])[source]
AutocorrectOptim is a prompt optimization technique that applies autocorrection to the prompt text. Correctly spelled words usually tokenize to fewer tokens than misspelled ones, so this is useful in scenarios where a human client types the text.
It inherits from the PromptOptim base class.
Example
>>> from prompt_optimizer.poptim import AutocorrectOptim
>>> p_optimizer = AutocorrectOptim()
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
- class prompt_optimizer.poptim.EntropyOptim(model_name: str = 'bert-base-cased', p: float = 0.1, verbose: bool = False, metrics: list = [], **kwargs)[source]
EntropyOptim is a prompt optimization technique based on the entropy values of tokens. A masked language model (bert-base-cased by default) is used to compute the probability of observing each token given its left and right context. These probabilities are then used to compute entropy values, and the optimizer removes the tokens whose entropies fall in the lowest p percentile.
The intuition behind this method is that the model can infill low-entropy (i.e., low-surprise or highly probable) tokens from the context. I will probably write a paper to explain this in more detail.
EntropyOptim inherits from the PromptOptim base class.
Example
>>> from prompt_optimizer.poptim import EntropyOptim
>>> p_optimizer = EntropyOptim(p=0.1)
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
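To make the mechanism concrete, here is a minimal sketch of per-token surprisal computed with Hugging Face transformers. It runs a single unmasked pass rather than masking each position in turn, and it is not the class's internal code; only the model name matches the documented default.
>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForMaskedLM
>>> tok = AutoTokenizer.from_pretrained("bert-base-cased")
>>> model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")
>>> inputs = tok("example prompt...", return_tensors="pt")
>>> with torch.no_grad():
...     logits = model(**inputs).logits[0]
>>> probs = torch.softmax(logits, dim=-1)
>>> ids = inputs["input_ids"][0]
>>> # surprisal of each observed token; low values are easy to infill
>>> surprisal = -torch.log(probs[torch.arange(len(ids)), ids])
Tokens whose values fall in the lowest p percentile are the removal candidates.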
- generate_confidence_values(sentence: str) → list [source]
Generates entropy values for each token in the sentence.
- Parameters
sentence (str) – The input sentence.
- Returns
A list of tuples containing token IDs and their corresponding entropy values.
- Return type
list
- optimize(prompt: str) → str [source]
Runs the prompt optimization technique on the prompt.
- Parameters
prompt (str) – The prompt text.
- Returns
The optimized prompt text.
- Return type
str
- class prompt_optimizer.poptim.LemmatizerOptim(verbose: bool = False, metrics: list = [])[source]
LemmatizerOptim is a prompt optimization technique based on lemmatization.
It inherits from the PromptOptim base class.
Example
>>> from prompt_optimizer.poptim import LemmatizerOptim
>>> p_optimizer = LemmatizerOptim()
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
- class prompt_optimizer.poptim.NameReplaceOptim(verbose: bool = False, metrics: list = [])[source]
NameReplaceOptim is a prompt optimization technique based on replacing names in the prompt. Some names tokenize to a single token while others require several; names with higher token counts can be replaced with single-token names to reduce token complexity. self.opti_names contains a pre-made list of such names for the tiktoken tokenizer. The list will need to be modified for other tokenizers.
It inherits from the PromptOptim base class.
Example
>>> from prompt_optimizer.poptim import NameReplaceOptim
>>> p_optimizer = NameReplaceOptim()
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
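The token-count difference between names is easy to verify directly with tiktoken; the two names below are illustrative, not taken from self.opti_names.
>>> import tiktoken
>>> enc = tiktoken.get_encoding("cl100k_base")
>>> # a rare name costs several tokens, a common one costs a single token
>>> len(enc.encode("Bartholomew")) > len(enc.encode("Tom"))
True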
- gen_name_map(text: str) → dict [source]
Generates a mapping of names in the prompt to optimized names.
- Parameters
text (str) – The prompt text.
- Returns
The mapping of names to optimized names.
- Return type
dict
- get_opti_names() → list [source]
Retrieves the list of optimized names.
- Returns
The list of optimized names.
- Return type
list
- opti_name_replace(text: str, mapping: dict) → str [source]
Replaces names in the text with optimized names based on the mapping.
- Parameters
text (str) – The text to perform name replacement.
mapping (dict) – The mapping of names to optimized names.
- Returns
The text with replaced names.
- Return type
str
- class prompt_optimizer.poptim.PromptOptim(verbose: bool = False, metrics: list = [], protect_tag: str = None)[source]
PromptOptim is an abstract base class for prompt optimization techniques.
It defines the common structure and interface for prompt optimization.
This class inherits from ABC (Abstract Base Class).
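As a sketch, a new technique only needs to subclass PromptOptim and implement optimize. The whitespace-collapsing optimizer below is hypothetical and not part of the library:
>>> from prompt_optimizer.poptim import PromptOptim
>>> class WhitespaceOptim(PromptOptim):
...     def optimize(self, prompt: str) -> str:
...         # collapse runs of whitespace into single spaces
...         return " ".join(prompt.split())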
- abstract optimize(prompt: str) → str [source]
Abstract method to run the prompt optimization technique on a prompt.
This method must be implemented by subclasses.
- Parameters
prompt (str) – The prompt text.
- Returns
The optimized prompt text.
- Return type
str
- run_json(json_data: list, skip_system: bool = False) → dict [source]
Applies prompt optimization to the JSON request object.
- Parameters
json_data (list) – The JSON request data.
skip_system (bool, optional) – Whether to skip messages with the "system" role. Defaults to False.
- Returns
The JSON data object with the content field replaced by the optimized prompt text.
- Return type
dict
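A usage sketch, assuming OpenAI-style chat messages with role and content keys (the messages themselves are placeholders):
>>> from prompt_optimizer.poptim import StopWordOptim
>>> p_optimizer = StopWordOptim()
>>> messages = [
...     {"role": "system", "content": "You are a helpful assistant."},
...     {"role": "user", "content": "example prompt..."},
... ]
>>> optimized = p_optimizer.run_json(messages, skip_system=True)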
- run_langchain(langchain_data: list, skip_system: bool = False)[source]
Runs the prompt optimizer on langchain chat data.
- Parameters
langchain_data (list) – The langchain data containing ‘type’ and ‘content’ fields.
skip_system (bool, optional) – Whether to skip data with type ‘system’. Defaults to False.
- Returns
The modified langchain data.
- Return type
list
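A usage sketch, assuming the entries are dictionaries with 'type' and 'content' fields as described above (langchain message objects may require adaptation):
>>> from prompt_optimizer.poptim import PunctuationOptim
>>> p_optimizer = PunctuationOptim()
>>> chat = [
...     {"type": "system", "content": "You are a helpful assistant."},
...     {"type": "user", "content": "example prompt..."},
... ]
>>> optimized = p_optimizer.run_langchain(chat, skip_system=True)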
- class prompt_optimizer.poptim.PulpOptim(p: float = 0.1, verbose: bool = False, metrics: list = [])[source]
PulpOptim is a prompt optimization technique based on integer linear programming, implemented with the PuLP library.
It inherits from the PromptOptim base class.
Example
>>> from prompt_optimizer.poptim import PulpOptim
>>> p_optimizer = PulpOptim(p=0.1)
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
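For intuition only, here is a toy token-selection problem posed in PuLP. The objective and constraints below are illustrative, not the class's actual formulation, which is internal to the implementation:
>>> import pulp
>>> tokens = "a toy example prompt with several words".split()
>>> keep = [pulp.LpVariable(f"keep_{i}", cat="Binary") for i in range(len(tokens))]
>>> prob = pulp.LpProblem("token_selection", pulp.LpMaximize)
>>> prob += pulp.lpSum(keep)  # toy objective: retain as many tokens as possible
>>> prob += pulp.lpSum(keep) <= int((1 - 0.1) * len(tokens))  # drop ~p of them
>>> _ = prob.solve(pulp.PULP_CBC_CMD(msg=False))
>>> kept = [t for t, k in zip(tokens, keep) if k.value() == 1]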
- class prompt_optimizer.poptim.PunctuationOptim(verbose: bool = False, metrics: list = [], **kwargs)[source]
PunctuationOptim is a prompt optimization technique that removes punctuation marks from the prompt. In most cases, LLMs can infer the punctuation themselves, so it can be removed.
It inherits from the PromptOptim base class.
Example
>>> from prompt_optimizer.poptim import PunctuationOptim
>>> p_optimizer = PunctuationOptim()
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
- class prompt_optimizer.poptim.Sequential(*optims: prompt_optimizer.poptim.base.PromptOptim)[source]
Sequential is a class that represents a sequential composition of prompt optimization techniques.
It applies a series of optimization techniques in sequence to the prompt.
Example
>>> optim1 = SomeOptimizationTechnique()
>>> optim2 = AnotherOptimizationTechnique()
>>> seq = Sequential(optim1, optim2)
>>> optimized_prompt = seq(prompt)
- Parameters
*optims – Variable-length argument list of prompt optimization techniques.
- optims
A list of prompt optimization techniques.
- Type
list
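With the concrete optimizers documented on this page, mirroring the class's own example, a pipeline looks like:
>>> from prompt_optimizer.poptim import Sequential, StopWordOptim, PunctuationOptim
>>> seq = Sequential(StopWordOptim(), PunctuationOptim())
>>> optimized_prompt = seq("example prompt...")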
- class prompt_optimizer.poptim.StemmerOptim(verbose: bool = False, metrics: list = [])[source]
StemmerOptim is a prompt optimization technique that applies stemming to the prompt.
Stemming reduces words to their base or root form, removing suffixes and prefixes.
Example
>>> from prompt_optimizer.poptim import StemmerOptim
>>> p_optimizer = StemmerOptim()
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
- class prompt_optimizer.poptim.StopWordOptim(verbose: bool = False, metrics: list = [])[source]
StopWordOptim is a prompt optimization technique that removes stop words from the prompt.
Stop words are commonly used words (e.g., “the”, “is”, “in”) that are often considered insignificant in natural language processing tasks.
Example
>>> from prompt_optimizer.poptim import StopWordOptim
>>> p_optimizer = StopWordOptim()
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
- class prompt_optimizer.poptim.SynonymReplaceOptim(verbose: bool = False, metrics: list = [], p: float = 0.5)[source]
SynonymReplaceOptim is a prompt optimization technique that replaces words in the prompt with their synonyms.
Synonyms are words that have similar meanings to the original word. Sometimes a synonym has a lower token count than the original word.
Example
>>> from prompt_optimizer.poptim import SynonymReplaceOptim
>>> p_optimizer = SynonymReplaceOptim()
>>> res = p_optimizer("example prompt...")
>>> optimized_prompt = res.content
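A sketch of the core idea using NLTK's WordNet and tiktoken; this is illustrative only, and the class's own candidate selection (including the p parameter) is not shown:
>>> import tiktoken
>>> from nltk.corpus import wordnet  # requires nltk.download("wordnet")
>>> enc = tiktoken.get_encoding("cl100k_base")
>>> def cheapest_synonym(word):
...     candidates = {l.name() for s in wordnet.synsets(word) for l in s.lemmas()}
...     candidates.add(word)  # never do worse than the original word
...     return min(candidates, key=lambda w: len(enc.encode(w)))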
- get_word_pos(word: str) → str [source]
Get the part of speech of a word.
- Parameters
word (str) – The word.
- Returns
The part of speech of the word.
- Return type
str
Metrics
- class prompt_optimizer.metric.BERTScoreMetric[source]
BERTScoreMetric is a metric that calculates precision, recall, and F1 score based on BERT embeddings. It inherits from the Metric base class.
Example
>>> from prompt_optimizer.metric import BERTScoreMetric
>>> metric = BERTScoreMetric()
>>> res = metric("default prompt...", "optimized prompt...")
- run(prompt_before: str, prompt_after: str) → dict [source]
Calculates precision, recall, and F1 score based on BERT embeddings.
- Parameters
prompt_before (str) – The prompt before optimization.
prompt_after (str) – The prompt after optimization.
- Returns
A dictionary containing the precision, recall, and F1 score.
- Return type
dict
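For comparison, the standalone bert-score package computes the same three quantities; whether BERTScoreMetric wraps this package internally is not shown here:
>>> from bert_score import score
>>> P, R, F1 = score(["optimized prompt..."], ["default prompt..."], lang="en")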
- class prompt_optimizer.metric.Metric[source]
Metric is an abstract base class for metrics that compare a prompt before and after optimization. It defines the common structure and interface for metric computation.
- batch_run(prompts_before: list, prompts_after: list, skip_system: bool = False, json: bool = False, langchain: bool = False) → float [source]
Runs the metric on a batch of prompts.
- Parameters
prompts_before (list) – List of prompts before the modification.
prompts_after (list) – List of prompts after the modification.
skip_system (bool, optional) – Whether to skip prompts with “system” role. Defaults to False.
json (bool, optional) – Whether the prompts are JSON data. Defaults to False.
langchain (bool, optional) – Whether the prompts are langchain chat data. Defaults to False.
- Returns
The average metric value across the batch.
- Return type
float
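A usage sketch with plain string prompts (the lists are placeholders):
>>> from prompt_optimizer.metric import TokenMetric
>>> metric = TokenMetric()
>>> avg = metric.batch_run(
...     ["default prompt 1...", "default prompt 2..."],
...     ["optimized prompt 1...", "optimized prompt 2..."],
... )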
- abstract run(prompt_before: str, prompt_after: str) → dict [source]
Abstract method to run the metric on the given prompts.
- Parameters
prompt_before (str) – The prompt before the modification.
prompt_after (str) – The prompt after the modification.
- Returns
The result of the metric computation.
- Return type
dict
- run_json(json_data_before: dict, json_data_after: dict) → dict [source]
Runs the metric on the content of JSON data.
- Parameters
json_data_before (dict) – JSON data before the modification with “content” key.
json_data_after (dict) – JSON data after the modification with “content” key.
- Returns
The result of the metric computation.
- Return type
dict
- class prompt_optimizer.metric.TokenMetric(tokenizer: str = 'cl100k_base')[source]
TokenMetric is a metric that calculates the optimization ratio based on the number of tokens reduced. It uses tiktoken to tokenize strings and count the number of tokens.
It inherits from the Metric base class.
Example
>>> from prompt_optimizer.metric import TokenMetric
>>> metric = TokenMetric()
>>> res = metric("default prompt...", "optimized prompt...")
- run(prompt_before: str, prompt_after: str) → dict [source]
Calculates the optimization ratio based on the number of tokens.
- Parameters
prompt_before (str) – The prompt before optimization.
prompt_after (str) – The prompt after optimization.
- Returns
A dictionary containing the optimization ratio.
- Return type
dict
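The underlying ratio can be reproduced with tiktoken directly; the formula below is one plausible definition, not necessarily the exact value the class returns:
>>> import tiktoken
>>> enc = tiktoken.get_encoding("cl100k_base")
>>> n_before = len(enc.encode("default prompt..."))
>>> n_after = len(enc.encode("optimized prompt..."))
>>> ratio = (n_before - n_after) / n_before  # fraction of tokens saved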