D1 02 Prompt templates and parsing
Prompt Templates, Few-Shot Learning & Output Parsing¶
This notebook demonstrates how to use LangChain's Prompt Templates, few-shot prompting techniques, and structured output parsing with a local open-source language model.
We will use the instruction-tuned model "NousResearch/Nous-Hermes-2-Mistral-7B-DPO" throughout, and explore:
- Prompt templates for reusability and clarity
- Few-shot prompting to guide the model with examples
- Structured output parsing using Pydantic
Bazzite-AI Setup Required
Run D0_00_Bazzite_AI_Setup.ipynb first to configure Ollama and verify GPU access.
Prompt Templates in LangChain¶
LangChain’s PromptTemplate lets you define reusable prompt structures with placeholders for dynamic input.
from langchain_core.prompts import PromptTemplate
prompt = PromptTemplate(
    input_variables=["topic"],
    template="Explain the following topic in simple terms:\n\n{topic}"
)
print(prompt.format(topic="What is machine learning?"))
This is helpful for keeping prompts clean and consistent across inputs.
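Because the template is reusable, the same structure can be mapped over many inputs. A minimal sketch (the topic list here is just illustrative):
# Hypothetical list of topics, reusing the same template for each
topics = ["gradient descent", "overfitting", "tokenization"]

for t in topics:
    print(prompt.format(topic=t))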
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
from langchain_huggingface.llms import HuggingFacePipeline
from langchain_huggingface import ChatHuggingFace
from langchain_core.prompts import PromptTemplate
from langchain_core.prompts.chat import SystemMessagePromptTemplate, HumanMessagePromptTemplate, ChatPromptTemplate, AIMessagePromptTemplate
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage
from langchain_core.output_parsers import PydanticOutputParser
from pydantic import BaseModel
from typing import List
import json
import re
import os
# Download model from HuggingFace (same base model as D1_01)
HF_LLM_MODEL = "NousResearch/Nous-Hermes-2-Mistral-7B-DPO"
# 4-bit quantization config for efficient loading
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(HF_LLM_MODEL)
# Load model with 4-bit quantization
model = AutoModelForCausalLM.from_pretrained(
    HF_LLM_MODEL,
    device_map="auto",
    quantization_config=quantization_config,
)

# Create text generation pipeline
text_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    return_full_text=False,
    eos_token_id=tokenizer.eos_token_id,
)
llm = HuggingFacePipeline(pipeline=text_pipeline)
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
Device set to use cuda:0
prompt_template = PromptTemplate(
    input_variables=["topic"],
    template="Explain the following topic in simple terms:\n\n{topic}"
)
print(prompt_template.format(topic="What is machine learning?"))
Explain the following topic in simple terms:

What is machine learning?
response = llm.invoke(prompt_template.format(topic="What is machine learning?"))
print(response)
Machine learning is a subset of AI (Artificial Intelligence) that involves the development of algorithms and models that can learn from data and then make predictions and decisions without being explicitly programmed. It enables systems to improve and adapt their performance based on new data. In simpler terms, machine learning is the ability of a computer system or algorithm to improve its performance at a specific task over time, without being explicitly programmed to do so. This is achieved by using statistical techniques to analyze large datasets, identify patterns, and make predictions and decisions based on this analysis. For example, an algorithm trained on a large dataset of images can learn to recognize and classify objects within those images, such as detecting and identifying different animals or vehicles. This ability to learn and adapt on its own makes machine learning a powerful tool for solving complex problems and making decisions based on data.
Let's have a look at another example:
simplify_prompt = PromptTemplate(
    input_variables=["clause"],
    template="""
You are a legal assistant that simplifies complex legal clauses into plain, understandable English.

Clause:
{clause}

Simplified Explanation:
"""
)
legal_clause = (
"The lessee shall indemnify and hold harmless the lessor from any liabilities, damages, "
"or claims arising out of the use of the premises, except in cases of gross negligence."
)
formatted_prompt = simplify_prompt.format(clause=legal_clause)
response = llm.invoke(formatted_prompt)
print("Simplified:\n", response)
Simplified: The person who rents the property (lessee) must protect and compensate the property owner (lessor) from any losses, harm, or legal issues that may arise from using the property, except in cases where a very serious lack of care caused the problem.
# Ollama configuration (no API key needed!)
OLLAMA_HOST = os.getenv("OLLAMA_HOST", "http://ollama:11434")
# === Model Configuration ===
HF_LLM_MODEL = "NousResearch/Nous-Hermes-2-Mistral-7B-DPO-GGUF"
OLLAMA_LLM_MODEL = f"hf.co/{HF_LLM_MODEL}:Q4_K_M"
print(f"Ollama host: {OLLAMA_HOST}")
print(f"Model: {OLLAMA_LLM_MODEL}")
Ollama host: http://ollama:11434
Model: hf.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO-GGUF:Q4_K_M
from langchain_openai import ChatOpenAI
# Use Ollama as OpenAI-compatible endpoint (no API key required)
llm_ollama = ChatOpenAI(
    base_url=f"{OLLAMA_HOST}/v1",
    api_key="ollama",  # Ollama ignores this but LangChain requires it
    model=OLLAMA_LLM_MODEL,
    temperature=0.7,
    max_tokens=512
)
response = llm_ollama.invoke(formatted_prompt)
print("Simplified:\n", response)
Simplified:
content="The person renting the property (lessee) promises to protect and not hold responsible the owner of the property (lessor) for any issues or problems that might happen because of using the place, unless it's something very careless." additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 50, 'prompt_tokens': 95, 'total_tokens': 145, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_provider': 'openai', 'model_name': 'hf.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO-GGUF:Q4_K_M', 'system_fingerprint': 'fp_ollama', 'id': 'chatcmpl-275', 'finish_reason': 'stop', 'logprobs': None} id='lc_run--019b66a9-c488-7a40-b47e-9c287eaf341e-0' usage_metadata={'input_tokens': 95, 'output_tokens': 50, 'total_tokens': 145, 'input_token_details': {}, 'output_token_details': {}}
Few Shot Prompt Template¶
Let's go over an example where you provide a short, hand-crafted conversation history to show the LLM chat bot a few examples, known as "few-shot prompts". We essentially prepend some example exchanges before sending the real message to the LLM. Be careful not to make the entire message too long, or you may hit context limits (though the latest models have quite large context windows).
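If you are unsure whether a prompt still fits, you can get a rough token count with the HuggingFace tokenizer loaded earlier; a small sketch (the helper name is ours):
# Rough context-budget check using the tokenizer from the setup cell
def count_tokens(text: str) -> int:
    return len(tokenizer.encode(text))

print(count_tokens("Explain the following topic in simple terms:\n\nWhat is machine learning?"))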
LangChain distinguishes between:
- PromptTemplates for simple string prompts
- MessagePromptTemplates for structured chat-style prompts using roles like system, user, assistant
So SystemMessagePromptTemplate helps build structured prompts that work with chat models.
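As a quick aside, ChatPromptTemplate also accepts plain (role, template) tuples, which is a compact equivalent to the explicit message prompt classes used below:
from langchain_core.prompts import ChatPromptTemplate

# Shorthand: role strings instead of SystemMessagePromptTemplate / HumanMessagePromptTemplate
quick_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that simplifies legal language."),
    ("human", "{legal_text}"),
])

print(quick_prompt.format_messages(legal_text="The lessee shall indemnify the lessor..."))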
Creating Example Inputs and Outputs
template = "You are a helpful assistant that translates complex legal terms into plain and understandable language."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
legal_text_1 = "Notwithstanding any provision to the contrary herein, the indemnitor agrees to indemnify, defend, and hold harmless the indemnitee from and against any and all claims, liabilities, damages, or expenses (including, without limitation, reasonable attorney’s fees) arising out of or related to the indemnitor’s acts or omissions, except to the extent that such claims, liabilities, damages, or expenses result from the gross negligence or willful misconduct of the indemnitee."
example_input_1 = HumanMessagePromptTemplate.from_template(legal_text_1)
plain_text_1 = "One party agrees to cover any costs, claims, or damages that happen because of their actions, including legal fees. However, they do not have to pay if the other party was extremely careless or acted intentionally wrong."
example_output_1 = AIMessagePromptTemplate.from_template(plain_text_1)
legal_text_2 = "This agreement shall be binding upon and inure to the benefit of the parties hereto and their respective heirs, executors, administrators, successors, and assigns, and shall not be assignable by either party without the prior written consent of the other, except that either party may assign its rights and obligations hereunder in connection with a merger, consolidation, or sale of substantially all of its assets."
example_input_2 = HumanMessagePromptTemplate.from_template(legal_text_2)
plain_text_2 = "This agreement applies to both parties and their future representatives, such as heirs or business successors. Neither party can transfer their rights under this agreement to someone else unless they get written permission. However, if one party merges with another company or sells most of its assets, they can transfer their rights without permission."
example_output_2 = AIMessagePromptTemplate.from_template(plain_text_2)
human_template = "{legal_text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
    [system_message_prompt, example_input_1, example_output_1, example_input_2, example_output_2, human_message_prompt]
)
some_example_text = "Any waiver of any term or condition of this agreement shall not be deemed a continuing waiver of such term or condition, nor shall it be considered a waiver of any other term or condition hereof. No failure or delay by either party in exercising any right, power, or privilege under this agreement shall operate as a waiver thereof, nor shall any single or partial exercise preclude any other or further exercise thereof or the exercise of any other right, power, or privilege."
request = chat_prompt.format_prompt(legal_text=some_example_text).to_messages()
result = llm_ollama.invoke(request)
print(result)
content="If one side makes an exception to any part of this agreement, it doesn't mean they're giving up on that part forever or that they'll ignore other parts of the agreement. Also, if one side doesn't act on time about their rights, powers, or privileges under the agreement, it doesn't mean they won't be able to do so later or that they'll give up any other rights." additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 90, 'prompt_tokens': 479, 'total_tokens': 569, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_provider': 'openai', 'model_name': 'hf.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO-GGUF:Q4_K_M', 'system_fingerprint': 'fp_ollama', 'id': 'chatcmpl-464', 'finish_reason': 'stop', 'logprobs': None} id='lc_run--019b66a9-ee4f-7900-836e-ef947f81c371-0' usage_metadata={'input_tokens': 479, 'output_tokens': 90, 'total_tokens': 569, 'input_token_details': {}, 'output_token_details': {}}
Parsing output¶
Large language models (LLMs) typically generate free-form text, which is great for human conversation — but not ideal when we want to extract specific information or automate downstream tasks.
The Problem
Imagine asking an LLM:
"Summarize this contract and give me the parties involved, the start date, and any penalties."
If the model responds with a long paragraph, it becomes difficult to:
- Reliably extract the pieces you need
- Validate whether the answer is complete
- Feed the output into another system
The Solution: Structured Output (e.g. JSON)
By instructing the LLM to return data in a structured format like JSON, we can:
- Parse the output automatically, although this does not always work
- Validate that required fields are present
- Integrate with other tools and code seamlessly
Structured output turns the LLM into a more reliable component of your application.
Parsing with tools like Pydantic ensures your data is clean, complete, and ready for automation.
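To see the contrast, here is a minimal sketch of the naive approach: ask for JSON in the prompt and hope json.loads succeeds. The except branch is exactly the failure mode the Pydantic-based parsing below addresses (actual model output will vary):
import json

raw = llm_ollama.invoke(
    "Return only a JSON object with keys 'parties' and 'start_date' for: "
    "Acme Corp and Beta LLC sign an agreement starting January 1, 2025."
).content

try:
    data = json.loads(raw)
    print(data.get("parties"), data.get("start_date"))
except json.JSONDecodeError:
    print("Model did not return clean JSON:", raw)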
Define format
Let's first see if we can get the output in the form of a JSON object by adding that request to the system prompt:
template = "You are a helpful assistant that translates complex legal terms into plain and understandable language. Respond only with a JSON object containing a single key 'translation' and its corresponding value."
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
chat_prompt = ChatPromptTemplate.from_messages(
    [system_message_prompt, example_input_1, example_output_1, example_input_2, example_output_2, human_message_prompt]
)
some_example_text = "Any waiver of any term or condition of this agreement shall not be deemed a continuing waiver of such term or condition, nor shall it be considered a waiver of any other term or condition hereof. No failure or delay by either party in exercising any right, power, or privilege under this agreement shall operate as a waiver thereof, nor shall any single or partial exercise preclude any other or further exercise thereof or the exercise of any other right, power, or privilege."
request = chat_prompt.format_prompt(legal_text=some_example_text).to_messages()
result = llm_ollama.invoke(request)
result
AIMessage(content="A waiver of any part of this agreement doesn't mean that the same thing can be waived later or that all other parts are waived too. If one party doesn't use their rights, it doesn't mean they give up those rights permanently or can't use them in the future.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 65, 'prompt_tokens': 499, 'total_tokens': 564, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_provider': 'openai', 'model_name': 'hf.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO-GGUF:Q4_K_M', 'system_fingerprint': 'fp_ollama', 'id': 'chatcmpl-87', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019b66aa-155d-7603-b118-7600620aaa82-0', usage_metadata={'input_tokens': 499, 'output_tokens': 65, 'total_tokens': 564, 'input_token_details': {}, 'output_token_details': {}})

That clearly didn't do the trick.
Pydantic¶
Pydantic is a Python library for defining data models with validation. With LangChain, it allows you to:
- Define the structure you expect from the model
- Automatically parse the raw LLM output
- Catch errors if fields are missing or malformed
Example

1. Define a Pydantic model
from pydantic import BaseModel
from typing import List
class ClauseSummary(BaseModel):
    parties: List[str]
    start_date: str
    penalty_clause: str
This defines the structure we want the LLM to return as a JSON object:
- A list of parties
- A start_date
- A penalty_clause string
2. Set up a parser using LangChain
from langchain_core.output_parsers import PydanticOutputParser
parser = PydanticOutputParser(pydantic_object=ClauseSummary)
This parser will take a raw string (from the LLM) and try to convert it into a ClauseSummary object.
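You can sanity-check the parser without calling the LLM at all by handing it a hand-written JSON string (the values below are made up):
# Hypothetical, well-formed input: the parser returns a validated ClauseSummary
demo = parser.parse('{"parties": ["Acme Corp", "Beta LLC"], "start_date": "2025-01-01", "penalty_clause": "EUR 5,000 on breach"}')
print(demo.parties)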
3. Include the schema in the system prompt
format_instructions = parser.get_format_instructions()
prompt = PromptTemplate(
    input_variables=["clause", "format_instructions"],
    template="""
Extract the following fields from the contract clause below and return them in **valid JSON format ONLY**, with no extra text or explanation.

Clause:
{clause}

{format_instructions}
"""
)
The format_instructions string tells the LLM exactly what JSON structure to return, based on your Pydantic model.
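You can print the instructions to see exactly what gets injected into the prompt:
print(parser.get_format_instructions())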
4. Run the LLM and parse the output
# Format the prompt with a clause first, then invoke the chat model
response = llm_ollama.invoke(prompt.format(clause=clause_text, format_instructions=format_instructions))

try:
    parsed = parser.parse(response.content)
    print(parsed.model_dump())
except Exception as e:
    print("Could not parse output.")
    print("Raw response:", response.content)
    print(e)
If the model returns a correctly structured JSON string, you now get a real Python object with attributes you can use:
parsed.parties
parsed.start_date
parsed.penalty_clause
With this model, you can ensure the LLM responds in a way that fits your expected format — or fail gracefully when it doesn't.
Let's try out this example:
# Define output structure
class ClauseSummary(BaseModel):
    parties: List[str]
    start_date: str
    penalty_clause: str
# Set up parser
parser = PydanticOutputParser(pydantic_object=ClauseSummary)
format_instructions = parser.get_format_instructions()
# Create prompt with correct input variables
prompt = PromptTemplate(
    input_variables=["clause", "format_instructions"],
    template="""
Extract the following fields from the contract clause below and return them in **valid JSON format ONLY**, with no extra text or explanation.

Clause:
{clause}

{format_instructions}
"""
)
# Clause to parse
clause_text = (
"The agreement between Acme Corp and Beta LLC begins on January 1, 2025. "
"If either party breaks the agreement, a €5,000 penalty applies."
)
# Format the full prompt
full_prompt = prompt.format(clause=clause_text, format_instructions=format_instructions)
# Run the model
response = llm_ollama.invoke(full_prompt)
# Parse the response
try:
    parsed = parser.parse(response.content)
    print(parsed.model_dump())
except Exception as e:
    print("Could not parse output.")
    print("Raw response:", response)
    print(e)
{'parties': ['Acme Corp', 'Beta LLC'], 'start_date': 'January 1, 2025', 'penalty_clause': '€5,000'}
print(parsed.parties)
print(parsed.start_date)
print(parsed.penalty_clause)
['Acme Corp', 'Beta LLC']
January 1, 2025
€5,000
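Pydantic also fails loudly when required fields are missing, which is one of the benefits listed above. A quick sketch of the failure path, feeding the parser deliberately incomplete JSON:
from langchain_core.exceptions import OutputParserException

try:
    parser.parse('{"parties": ["Acme Corp"]}')  # start_date and penalty_clause missing
except OutputParserException as e:
    print("Validation failed as expected:", e)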
Let's try that on the earlier example that simplified legal clauses and asked for JSON output.
# Define output schema with Pydantic
class LegalSimplification(BaseModel):
    translation: str
parser = PydanticOutputParser(pydantic_object=LegalSimplification)
# Define the system prompt
format_instructions = parser.get_format_instructions()
system_message = SystemMessage(content=f"""You are a helpful assistant that translates complex legal terms into plain and understandable language.
Respond only in this format: {format_instructions}
Do not ask for clarification. Always use the given legal input.""")
# Define few-shot examples
examples = [
    {
        "input": "Notwithstanding any provision to the contrary herein, the indemnitor agrees to indemnify, defend, and hold harmless the indemnitee...",
        "output": "One party agrees to cover any costs, claims, or damages that happen because of their actions..."
    },
    {
        "input": "This agreement shall be binding upon and inure to the benefit of the parties...",
        "output": "This agreement applies to both parties and their future representatives..."
    }
]

few_shot_messages = []
for ex in examples:
    few_shot_messages.append(HumanMessage(content=ex["input"]))
    few_shot_messages.append(AIMessage(content=f'{{"translation": "{ex["output"]}"}}'))
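One robustness note: building the example answers with an f-string breaks as soon as an output contains a double quote. json.dumps handles the escaping for you; a safer variant of the loop above:
few_shot_messages = []
for ex in examples:
    few_shot_messages.append(HumanMessage(content=ex["input"]))
    # json.dumps escapes any quotes or backslashes inside the example output
    few_shot_messages.append(AIMessage(content=json.dumps({"translation": ex["output"]})))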
# Define the legal input text
legal_text = (
"Any waiver of any term or condition of this agreement shall not be deemed a continuing waiver of such term "
"or condition, nor shall it be considered a waiver of any other term or condition hereof. No failure or delay "
"by either party in exercising any right, power, or privilege under this agreement shall operate as a waiver "
"thereof, nor shall any single or partial exercise preclude any other or further exercise thereof or the "
"exercise of any other right, power, or privilege."
)
user_message = HumanMessage(content=legal_text)
# Build full message list
messages = [system_message] + few_shot_messages + [user_message]
# Sanity check the prompt
print("\n\n===== Prompt Sent to Model =====")
for m in messages:
    print(f"{m.type.upper()}: {m.content}\n")
===== Prompt Sent to Model =====
SYSTEM: You are a helpful assistant that translates complex legal terms into plain and understandable language.
Respond only in this format: The output should be formatted as a JSON instance that conforms to the JSON schema below.
As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.
Here is the output schema:
```
{"properties": {"translation": {"title": "Translation", "type": "string"}}, "required": ["translation"]}
```
Do not ask for clarification. Always use the given legal input.
HUMAN: Notwithstanding any provision to the contrary herein, the indemnitor agrees to indemnify, defend, and hold harmless the indemnitee...
AI: {"translation": "One party agrees to cover any costs, claims, or damages that happen because of their actions..."}
HUMAN: This agreement shall be binding upon and inure to the benefit of the parties...
AI: {"translation": "This agreement applies to both parties and their future representatives..."}
HUMAN: Any waiver of any term or condition of this agreement shall not be deemed a continuing waiver of such term or condition, nor shall it be considered a waiver of any other term or condition hereof. No failure or delay by either party in exercising any right, power, or privilege under this agreement shall operate as a waiver thereof, nor shall any single or partial exercise preclude any other or further exercise thereof or the exercise of any other right, power, or privilege.
# Create HF text-generation pipeline with lower temperature for parsing accuracy
# (temperature=0.6 here vs 0.7 in initial setup for more deterministic JSON output)
hf_pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    return_full_text=False,
    eos_token_id=tokenizer.eos_token_id,
    skip_special_tokens=True,
)
Device set to use cuda:0
# Wrap in LangChain-compatible Chat LLM
wrapped_llm = HuggingFacePipeline(pipeline=hf_pipe)
llm_chat = ChatHuggingFace(llm=wrapped_llm)
# Generate model output
raw_output = llm_chat.invoke(messages)
print(raw_output.content)
{"translation": "Changing or ignoring any part of this agreement doesn't mean we ignore other parts or future changes. Delaying using any rights, powers, or privileges doesn't mean we can't use them later or other rights."}
# Extract valid JSON from output
def extract_first_json(text):
    match = re.search(r'\{.*?\}', text, re.DOTALL)
    return match.group(0) if match else text.strip()

try:
    output_text = raw_output.content if isinstance(raw_output, AIMessage) else raw_output
    clean_output = extract_first_json(output_text)
    result = parser.parse(clean_output)

    print("\nSimplified translation:")
    print(result.translation)

    # Combine original and simplified
    entry = {
        "legal_text": legal_text,
        "translation": result.translation
    }

    # Load existing data if file exists
    data = []
    output_file = "simplified_output.json"
    if os.path.exists(output_file):
        with open(output_file, "r") as f:
            try:
                data = json.load(f)
                if not isinstance(data, list):
                    print("Warning: existing file is not a list. Overwriting.")
                    data = []
            except json.JSONDecodeError:
                data = []

    # Append new entry
    data.append(entry)

    # Write back to file
    with open(output_file, "w") as f:
        json.dump(data, f, indent=2)

    print(f"\nAppended to {output_file}")

except Exception as e:
    print("\nCould not parse output.")
    print("Raw output:", raw_output)
    print(e)
Simplified translation:
Changing or ignoring any part of this agreement doesn't mean we ignore other parts or future changes. Delaying using any rights, powers, or privileges doesn't mean we can't use them later or other rights.

Appended to simplified_output.json
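As an alternative to the regex extraction above, LangChain also offers an OutputFixingParser, which sends malformed output back to an LLM with a repair instruction before re-parsing. A sketch, assuming the full langchain package (not just langchain_core) is installed in this environment:
from langchain.output_parsers import OutputFixingParser

# Wraps our Pydantic parser; on a parse failure it asks llm_ollama to fix the text
fixing_parser = OutputFixingParser.from_llm(parser=parser, llm=llm_ollama)
fixed = fixing_parser.parse(raw_output.content)
print(fixed.translation)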
# === Unload Ollama Model & Shutdown Kernel ===
# Unloads the model from GPU memory before shutting down
try:
    import ollama
    print(f"Unloading Ollama model: {OLLAMA_LLM_MODEL}")
    ollama.generate(model=OLLAMA_LLM_MODEL, prompt="", keep_alive=0)
    print("Model unloaded from GPU memory")
except Exception as e:
    print(f"Model unload skipped: {e}")

# Shut down the kernel to fully release resources
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(restart=False)
Unloading Ollama model: hf.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO-GGUF:Q4_K_M
Model unloaded from GPU memory
{'status': 'ok', 'restart': False}