Text Generation
Overview
This guide explains how to use the AI.ML Text API for chat-based completions, summarisation, and sentiment analysis.
It supports models such as llama-3.1-8b-instruct via the /openai/chat/completions endpoint.
Base URL: https://api.ai.ml
Auth: Bearer token via header:
Authorization: Bearer <AIML_API_KEY>
Text API
Endpoint
POST /openai/chat/completions
Request Body (JSON)
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | ✓ | — | Model name, e.g., "llama-3.1-8b-instruct". See the full model list: <link to model list>. |
| messages | array | ✓ | — | Conversation history: list of { role, content } objects. Roles: system, user, assistant. |
| temperature | number | ✕ | 1 | Controls randomness (0 = deterministic, >1 = more creative). |
| max_tokens | integer | ✕ | 2048 | Maximum tokens to generate. |
| top_p | number | ✕ | 1 | Nucleus sampling: considers tokens with cumulative probability ≤ top_p. |
| frequency_penalty | number | ✕ | 0 | Penalizes repetition of phrases. Range: -2 to 2. |
| presence_penalty | number | ✕ | 0 | Encourages new topics. Range: -2 to 2. |
| stop | string[] | ✕ | — | List of stop sequences to cut off output. |
| stream | boolean | ✕ | false | If true, streams tokens as they’re generated. |
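For reference, a raw HTTP request looks like the sketch below, using Node's built-in fetch (Node 18+) and a few of the parameters from the table above. The response shape (data.choices[0].message.content) assumes the endpoint is OpenAI-compatible, as the SDK example below also does; the model name and parameter values are illustrative.

// Minimal raw-HTTP sketch; assumes an OpenAI-compatible response shape.
async function rawRequest() {
  const response = await fetch("https://api.ai.ml/openai/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.AIML_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "llama-3.1-8b-instruct",
      messages: [{ role: "user", content: "Say hello." }],
      temperature: 0.7,
      max_tokens: 100,
      stop: ["\n\n"], // optional: cut output at the first blank line
    }),
  });

  const data = await response.json();
  console.log(data.choices[0].message.content);
}

rawRequest();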
Example Usage
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.AIML_API_KEY,
  baseURL: "https://api.ai.ml/openai",
});

async function main() {
  const completion = await client.chat.completions.create({
    model: "llama-3.1-8b-instruct",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Explain quantum computing in simple terms." }
    ],
    temperature: 0.7,
    max_tokens: 500,
  });
  console.log(completion.choices[0].message.content);
}

main();
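The stream parameter from the table above switches the call to incremental delivery. A minimal sketch reusing the client from the example above, assuming the endpoint follows OpenAI's streaming format (token deltas in chunk.choices[0].delta.content):

// Streams tokens as they are generated instead of waiting for the full reply.
async function streamDemo() {
  const stream = await client.chat.completions.create({
    model: "llama-3.1-8b-instruct",
    messages: [{ role: "user", content: "Write a haiku about the sea." }],
    stream: true,
  });

  // Each chunk carries a small delta of the reply; print them as they arrive.
  for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  }
}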
Building a Chat Application
To build a chat application:
- Maintain conversation history in messages.
- Alternate roles: system → user → assistant.
- Send the entire history on each API call for context.
Example:
"messages": [
{ "role": "system", "content": "You are a customer support bot." },
{ "role": "user", "content": "I can’t log into my account." },
{ "role": "assistant", "content": "Have you tried resetting your password?" },
{ "role": "user", "content": "Yes, but it didn’t work." }
]The assistant’s next reply will use this context.
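In code, this amounts to keeping a single messages array and appending each turn to it. A minimal sketch, reusing the client from the example above (model name and prompts are illustrative):

// All conversation state lives in this array; every call sends the full history.
const history = [
  { role: "system", content: "You are a customer support bot." },
];

async function ask(userText) {
  history.push({ role: "user", content: userText });

  const completion = await client.chat.completions.create({
    model: "llama-3.1-8b-instruct",
    messages: history,
  });

  // Append the assistant's reply so the next turn sees the full context.
  const reply = completion.choices[0].message.content;
  history.push({ role: "assistant", content: reply });
  return reply;
}

// Each call builds on the previous turns:
// await ask("I can’t log into my account.");
// await ask("Yes, but it didn’t work.");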
Text Summarisation
Summarisation is prompt-driven: include summarisation instructions in the system or user message.
Example:
"messages": [
{ "role": "system", "content": "You are a summarization assistant. Summarize the news article in 3 bullet points" },
{ "role": "user", "content": "Summarize the following: <paste long article here>" }
]Sentiment Analysis
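As a runnable sketch, reusing the client from above (the temperature and max_tokens values are illustrative choices to keep summaries focused and bounded):

// Summarises an article into three bullet points.
async function summarize(article) {
  const completion = await client.chat.completions.create({
    model: "llama-3.1-8b-instruct",
    messages: [
      { role: "system", content: "You are a summarization assistant. Summarize the news article in 3 bullet points." },
      { role: "user", content: `Summarize the following: ${article}` },
    ],
    temperature: 0.2, // keep the summary focused rather than creative
    max_tokens: 300,
  });
  return completion.choices[0].message.content;
}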
Sentiment Analysis
Sentiment analysis is also prompt-driven.
Ask the model to classify sentiment as Positive, Negative, or Neutral.
Example:
"messages": [
{ "role": "system", "content": "You are a sentiment analysis assistant. Classify the reviews given as Positive, Negative, or Neutral" },
{ "role": "user", "content": "The product is amazing, I loved the fast delivery!" }
]Response: Positive
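The same pattern as a runnable sketch, reusing the client from above; temperature: 0 keeps the classification deterministic, in line with the best practices below:

// Classifies a single review as Positive, Negative, or Neutral.
async function classifySentiment(review) {
  const completion = await client.chat.completions.create({
    model: "llama-3.1-8b-instruct",
    messages: [
      { role: "system", content: "You are a sentiment analysis assistant. Classify the given review as Positive, Negative, or Neutral." },
      { role: "user", content: review },
    ],
    temperature: 0, // deterministic output for classification
    max_tokens: 5,  // the label is a single word
  });
  return completion.choices[0].message.content.trim();
}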
Notes & Best Practices
- Always specify a system role to set behavior (“You are a legal summarisation assistant”, etc.).
- Use temperature=0 for deterministic outputs (good for analysis and classification).
- Use temperature≈0.7 for creative writing and conversation.
- Keep max_tokens aligned with your use case (short answers vs. long summaries).
- Sentiment and summarisation tasks work best when you clearly instruct the model.
- For prompting best practices, refer to our detailed Guidelines for Text Generation reference.