Summarize with Various AI Models
Simon-Pierre Boucher
2024-09-14
This Python script is designed to summarize a given text using multiple language models provided by OpenAI, Anthropic, and Mistral APIs. Here’s a detailed breakdown of its key components:
1. Environment Setup:
load_dotenv(): Loads environment variables, including the API keys for OpenAI, Anthropic, and Mistral. This allows secure storage of sensitive information like API keys, which are required to access these services.
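Because a missing key only surfaces later as an API error, it can help to verify the environment up front. The helper below is a hypothetical sketch (check_api_keys is not part of the script); it simply reports which expected variables are unset:

```python
import os

def check_api_keys(names):
    """Return the names of environment variables that are not set.

    A small sketch for failing fast before any API calls are made;
    check_api_keys is a hypothetical helper, not part of the script below.
    """
    return [name for name in names if not os.getenv(name)]
```

For example, `check_api_keys(["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "MISTRAL_API_KEY"])` returns an empty list when all keys are loaded, so a non-empty result can be turned into a clear startup error.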
2. Text Summarization Functions:
openai_summarize_text():
- Sends the input text to OpenAI's models (e.g., gpt-4) for summarization.
- The function constructs a prompt that includes the task description ("Summarize the following text.") and the actual input text.
- It makes an API request to OpenAI and retrieves the summarized text.
- Configurable parameters like temperature, max tokens, and stop sequences can be adjusted.
anthropic_summarize_text():
- Similar to the OpenAI function, it sends a request to the Anthropic API (e.g., claude-3-5-sonnet), asking for a summary of the provided text.
- The function handles the API request and returns the generated summary.
run_mistral():
- A helper function that sends a chat request to the Mistral API and returns the generated text.
- It accepts the user message and the model name; generation settings such as temperature and max tokens are fixed inside the function.
mistral_summarize_text():
- Formats the input text as a summarization task for Mistral and calls run_mistral() to generate the summary.
3. Aggregated Summarization:
summarize_text_with_all_models():
- Iterates over the models from OpenAI, Anthropic, and Mistral, generating a summary of the same input text with each one.
- It stores the generated summaries in a dictionary, with model names as keys and summaries as values.
4. Main Program Execution:
API Keys and Input Text:
- The API keys are retrieved from the environment variables, and the text to be summarized is defined (in this case, about the impact of climate change on polar bears).
Model Lists:
- Lists of models are specified for evaluation: OpenAI (e.g., gpt-3.5-turbo, gpt-4), Anthropic (e.g., claude-3-5-sonnet, claude-3-opus), and Mistral (e.g., open-mistral-7b, mistral-medium-latest).
Generating Summaries:
- summarize_text_with_all_models() is called to generate a summary from each model for the given text.
Results Output:
- The script prints, for each model, the model name, the word count of its summary, and the summarized text.
Purpose:
This script allows users to compare the output of various models from OpenAI, Anthropic, and Mistral when tasked with summarizing the same piece of text. It's helpful for benchmarking and evaluating different models' performance in summarization tasks.
In [1]:
import os
from dotenv import load_dotenv
import requests
import json
# Load environment variables
load_dotenv()
Out[1]:
In [2]:
def openai_summarize_text(api_key, text, model="gpt-4", temperature=0.7, max_tokens=1024, stop=None):
"""
Summarizes a given text using the OpenAI API.
"""
task_description = "Summarize the following text."
prompt_content = f"""
{task_description}
Text: {text}
Summary:
"""
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}"
}
data = {
"model": model,
"messages": [
{"role": "user", "content": prompt_content}
],
"temperature": temperature,
"max_tokens": max_tokens
}
if stop:
data["stop"] = stop
response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, data=json.dumps(data))
if response.status_code == 200:
response_json = response.json()
generated_summary = response_json["choices"][0]["message"]["content"].strip()
return generated_summary
else:
return f"Error {response.status_code}: {response.text}"
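Each request function above returns an error string on any non-200 response; transient failures such as 429 rate limits could instead be retried. The sketch below is hypothetical (post_with_retry is not part of the script) and wraps any zero-argument request callable with exponential backoff:

```python
import time

def post_with_retry(send, retries=3, backoff=0.5):
    """Retry a request on 429/5xx responses with exponential backoff.

    `send` is any zero-argument callable returning an object with a
    .status_code attribute, e.g. lambda: requests.post(url, ...).
    A minimal sketch; real rate-limit handling might also honor the
    Retry-After header.
    """
    response = send()
    for attempt in range(retries):
        # Only retry on rate limiting or transient server errors.
        if response.status_code not in (429, 500, 502, 503):
            break
        time.sleep(backoff * (2 ** attempt))
        response = send()
    return response
```

A call site would then read `response = post_with_retry(lambda: requests.post(url, headers=headers, json=data))`, leaving the rest of each function unchanged.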
In [3]:
def anthropic_summarize_text(api_key, text, model="claude-3-5-sonnet-20240620", max_tokens=1024, temperature=0.7):
"""
Summarizes a given text using the Anthropic API.
"""
url = "https://api.anthropic.com/v1/messages"
headers = {
"x-api-key": api_key,
"anthropic-version": "2023-06-01",
"content-type": "application/json"
}
data = {
"model": model,
"max_tokens": max_tokens,
"temperature": temperature,
"messages": [
{"role": "user", "content": f"Please summarize the following text:\n\n{text}"}
]
}
response = requests.post(url, headers=headers, data=json.dumps(data))
if response.status_code == 200:
response_json = response.json()
generated_text = response_json["content"][0]["text"].strip()
return generated_text
else:
return f"Error {response.status_code}: {response.text}"
In [4]:
def run_mistral(api_key, user_message, model="mistral-medium-latest"):
    """
    Sends a chat request to the Mistral API and returns the generated text.
    """
url = "https://api.mistral.ai/v1/chat/completions"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {api_key}"
}
data = {
"model": model,
"messages": [
{"role": "user", "content": user_message}
],
"temperature": 0.7,
"top_p": 1.0,
"max_tokens": 512,
"stream": False,
"safe_prompt": False,
"random_seed": 1337
}
response = requests.post(url, headers=headers, data=json.dumps(data))
if response.status_code == 200:
response_json = response.json()
return response_json["choices"][0]["message"]["content"].strip()
else:
return f"Error {response.status_code}: {response.text}"
def mistral_summarize_text(api_key, text, model="mistral-medium-latest"):
"""
Summarizes a given text using the Mistral API.
"""
user_message = f"Please summarize the following text:\n\n{text}"
return run_mistral(api_key, user_message, model=model)
In [5]:
# Top-level function to summarize text with the models of all three APIs
def summarize_text_with_all_models(openai_key, anthropic_key, mistral_key, text, openai_models, anthropic_models, mistral_models, temperature=0.7, max_tokens=100, stop=None):
results = {}
    # Summarize the text with every OpenAI model
for model in openai_models:
openai_result = openai_summarize_text(openai_key, text, model, temperature, max_tokens, stop)
results[f'openai_{model}'] = openai_result
    # Summarize the text with every Anthropic model
for model in anthropic_models:
anthropic_result = anthropic_summarize_text(anthropic_key, text, model, max_tokens, temperature)
results[f'anthropic_{model}'] = anthropic_result
    # Summarize the text with every Mistral model
for model in mistral_models:
mistral_result = mistral_summarize_text(mistral_key, text, model)
results[f'mistral_{model}'] = mistral_result
return results
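Since the function returns a plain model-name-to-summary dictionary, the results could also be persisted for later comparison across runs. The helper below is a hypothetical sketch (save_results is not part of the notebook):

```python
import json

def save_results(results, path="summaries.json"):
    """Write the model-name -> summary dictionary to a JSON file.

    A hypothetical convenience helper, not part of the script, so that
    summaries from different runs can be diffed or compared later.
    """
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, ensure_ascii=False, indent=2)
```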
In [6]:
if __name__ == "__main__":
openai_key = os.getenv("OPENAI_API_KEY")
anthropic_key = os.getenv("ANTHROPIC_API_KEY")
mistral_key = os.getenv("MISTRAL_API_KEY")
text_to_summarize = "Polar bears are increasingly threatened by climate change. As the Arctic ice melts, their habitat shrinks, making it difficult for them to hunt seals, their primary food source. This leads to malnutrition and decreased reproduction rates. Conservation efforts are crucial to mitigate these effects and protect polar bear populations."
openai_models = ["gpt-3.5-turbo", "gpt-4", "gpt-4-turbo", "gpt-4o-mini", "gpt-4o"]
anthropic_models = ["claude-3-5-sonnet-20240620", "claude-3-opus-20240229", "claude-3-sonnet-20240229", "claude-3-haiku-20240307"]
mistral_models = ["open-mistral-7b", "open-mixtral-8x7b", "open-mixtral-8x22b", "mistral-small-latest", "mistral-medium-latest", "mistral-large-latest"]
results = summarize_text_with_all_models(openai_key, anthropic_key, mistral_key, text_to_summarize, openai_models, anthropic_models, mistral_models)
for model_name, result in results.items():
word_count = len(result.split())
print(f"\033[1mResult from {model_name} ({word_count} words):\033[0m\n{result}\n")
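Beyond printing, the collected summaries can be compared programmatically. As one simple illustration, the hypothetical helper below (not part of the script) ranks the models by summary word count, matching the count printed above:

```python
def rank_by_length(results):
    """Return (model_name, summary) pairs sorted by summary word count.

    rank_by_length is a hypothetical helper, not part of the script; it
    sketches one simple way to compare the collected summaries.
    """
    return sorted(results.items(), key=lambda item: len(item[1].split()))
```

Calling `rank_by_length(results)` after the main loop would list the most concise model first, which can be a useful starting point for a benchmarking report.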