LLMs return JSON. Your RAG pipeline returns lists of retrieved chunks. Your evaluation harness processes arrays of model responses. Python's control flow and collection types are the primary tools you'll use to work with this data, and for these tasks they are often more compact than their C# equivalents.
This lesson uses realistic LLM response structures to show how Python's loops, conditionals, and comprehensions handle the data manipulation patterns you'll encounter daily.
Python's Core Collection Types
Python has four built-in collection types. Each maps to a C# equivalent, but with important behavioral differences:
| Python | C# Equivalent | Mutable? | Ordered? | Unique Keys? |
|---|---|---|---|---|
| list | List<T> | Yes | Yes | No |
| tuple | ValueTuple / record | No | Yes | No |
| dict | Dictionary<K,V> | Yes | Yes (3.7+) | Keys only |
| set | HashSet<T> | Yes | No | Yes |
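The behaviors in the table can be checked directly in a REPL. The following is a minimal sketch; the variable names and values are illustrative and do not come from any library:

```python
# Illustrative values only; all names below are hypothetical.
chunks = ["c1", "c2", "c2"]      # list: mutable, ordered, duplicates allowed
chunks.append("c3")

point = ("doc_001", 0.87)        # tuple: ordered but immutable
try:
    point[1] = 0.9               # item assignment is not supported on tuples
except TypeError as exc:
    tuple_error = str(exc)

scores = {"c1": 0.9, "c2": 0.8}  # dict: keeps insertion order (Python 3.7+)
scores["c1"] = 0.95              # same key: the value is overwritten, not duplicated
key_order = list(scores)

unique_chunks = set(chunks)      # set: duplicates collapse, no order guarantee

print(chunks)              # ['c1', 'c2', 'c2', 'c3']
print(key_order)           # ['c1', 'c2']
print(len(unique_chunks))  # 3
```

The tuple assignment fails with a TypeError rather than silently doing nothing, which is the behavior to expect whenever code tries to mutate an immutable collection.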
Working with Lists
Lists are the bread and butter of AI pipelines — storing retrieved chunks, model responses, message histories, and evaluation results.
# Simulated LLM batch response (realistic structure)
responses = [
    {"id": "msg_1", "text": "Embeddings are dense vector representations...", "tokens": 42, "finish_reason": "end_turn"},
    {"id": "msg_2", "text": "RAG stands for Retrieval-Augmented Generation...", "tokens": 38, "finish_reason": "end_turn"},
    {"id": "msg_3", "text": None, "tokens": 0, "finish_reason": "max_tokens"},
    {"id": "msg_4", "text": "Vector databases store high-dimensional...", "tokens": 55, "finish_reason": "end_turn"},
]
# List operations
print(len(responses)) # 4
print(responses[0]) # first item
print(responses[-1]) # last item
print(responses[1:3]) # slice: items at index 1 and 2
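# Mutating (done on a shallow copy so the later examples still see the original four responses)
batch = responses.copy()
batch.append({"id": "msg_5", "text": "New response", "tokens": 10, "finish_reason": "end_turn"})
batch.pop(2)  # remove by index (removes msg_3)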
Control Flow: Conditionals and Loops
if / elif / else
Python's conditionals are close to C# but use indentation instead of braces, and elif instead of else if:
def classify_finish_reason(response: dict) -> str:
    reason = response.get("finish_reason", "unknown")
    if reason == "end_turn":
        return "complete"
    elif reason == "max_tokens":
        return "truncated"
    elif reason == "stop_sequence":
        return "stopped"
    else:
        return f"unknown: {reason}"

for r in responses:
    status = classify_finish_reason(r)
    print(f"{r['id']}: {status}")
for Loops and range()
// C# (for comparison): index-based
for (int i = 0; i < responses.Count; i++)
{
    Console.WriteLine(responses[i]["id"]);
}
// C#: for-each
foreach (var r in responses)
{
    Console.WriteLine(r["id"]);
}
# Python: index-based (use range)
for i in range(len(responses)):
    print(responses[i]["id"])
# For-each (preferred)
for r in responses:
    print(r["id"])
# With index (preferred over range+len)
for i, r in enumerate(responses):
    print(f"{i}: {r['id']}")
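range() also accepts start, stop, and step arguments, which helps in the cases where you do need indices, such as slicing a list into fixed-size batches. A minimal sketch, assuming illustrative prompt strings and a hypothetical batch size of 2:

```python
prompts = ["p1", "p2", "p3", "p4", "p5"]  # illustrative data
batch_size = 2                            # hypothetical value

batches = []
# range(start, stop, step) yields 0, 2, 4 for a list of five items
for start in range(0, len(prompts), batch_size):
    # Slicing past the end of a list is safe; the last batch is just shorter
    batches.append(prompts[start:start + batch_size])

print(batches)  # [['p1', 'p2'], ['p3', 'p4'], ['p5']]
```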
Useful Iteration Patterns for AI Code
# zip: pair two lists together
prompts = ["What is RAG?", "Explain embeddings", "What is a token?"]
model_outputs = ["RAG is...", "Embeddings are...", "A token is..."]
for prompt, output in zip(prompts, model_outputs):
    print(f"Q: {prompt}\nA: {output}\n")
# Iterating dict items (extremely common with JSON responses)
response_meta = {"model": "claude-3-5-sonnet", "usage": {"input_tokens": 100, "output_tokens": 250}}
for key, value in response_meta.items():
    print(f"{key}: {value}")
# Early exit with break
for r in responses:
    if r.get("finish_reason") == "max_tokens":
        print(f"Warning: {r['id']} was truncated!")
        break  # stop after first truncated response
List Comprehensions: Python's Superpower
List comprehensions are a concise, readable way to filter and transform collections. They're used everywhere in production AI code — in LINQ-style transformations, data cleaning, and result post-processing. Once you internalize them, you'll wonder how you lived without them.
// C# (LINQ): filter complete responses
// The cast makes == compare string values rather than object references
var complete = responses
    .Where(r => (string)r["finish_reason"] == "end_turn")
    .ToList();
// Extract texts
var texts = responses
    .Where(r => r["text"] != null)
    .Select(r => (string)r["text"])
    .ToList();
# Python: filter complete responses
complete = [
    r for r in responses
    if r["finish_reason"] == "end_turn"
]
# Extract texts (non-null only)
texts = [
    r["text"] for r in responses
    if r["text"] is not None
]
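If the comprehension syntax is new, it can help to read it as shorthand for an explicit loop. The sketch below uses a small stand-in list (sample_responses is illustrative and mirrors the structure used above) and builds the same result both ways:

```python
sample_responses = [
    {"id": "msg_1", "text": "Embeddings are...", "finish_reason": "end_turn"},
    {"id": "msg_2", "text": None, "finish_reason": "max_tokens"},
    {"id": "msg_3", "text": "Vector databases...", "finish_reason": "end_turn"},
]

# Explicit loop: filter, transform, append.
texts_loop = []
for r in sample_responses:
    if r["text"] is not None:
        texts_loop.append(r["text"].upper())

# Comprehension: [transform for item in iterable if condition]
texts_comp = [r["text"].upper() for r in sample_responses if r["text"] is not None]

print(texts_loop == texts_comp)  # True
```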
Practical Comprehension Patterns
# 1. Transform: extract and clean text from responses
cleaned_texts = [r["text"].strip() for r in responses if r["text"]]
# 2. Filter + transform: summarize high-token responses by word count
long_responses = [
    {"id": r["id"], "word_count": len(r["text"].split())}
    for r in responses
    if r["text"] and r["tokens"] > 40
]
# 3. Nested comprehension: flatten a list of chunk lists
retrieved_chunks = [["chunk1a", "chunk1b"], ["chunk2a"], ["chunk3a", "chunk3b", "chunk3c"]]
all_chunks = [chunk for chunk_list in retrieved_chunks for chunk in chunk_list]
# Result: ["chunk1a", "chunk1b", "chunk2a", "chunk3a", "chunk3b", "chunk3c"]
# 4. Dict comprehension: build a lookup map
response_map = {r["id"]: r["text"] for r in responses if r["text"]}
# Use: response_map["msg_1"] → "Embeddings are dense vector representations..."
Dictionaries: Working with LLM JSON
LLM APIs return JSON objects. In Python, these become dictionaries. Knowing the right dict access patterns prevents common bugs:
# Simulated Anthropic API response dict
api_response = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Here's the analysis..."}],
    "model": "claude-3-5-sonnet-20241022",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 150, "output_tokens": 320},
}
# Access patterns: strict vs. safe
text = api_response["content"][0]["text"]  # strict: raises KeyError if a key is missing (IndexError if content is empty)
stop = api_response.get("stop_reason", "unknown")  # safe: returns the default if the key is missing
# Nested safe access
input_tokens = api_response.get("usage", {}).get("input_tokens", 0)
# Check key exists
if "usage" in api_response:
    total = api_response["usage"]["input_tokens"] + api_response["usage"]["output_tokens"]
# Destructuring-style unpacking
model = api_response.get("model")
usage = api_response.get("usage", {})
input_t, output_t = usage.get("input_tokens", 0), usage.get("output_tokens", 0)
print(f"Model: {model} | Tokens: {input_t} in, {output_t} out")
Use d["key"] when the key must exist (loud failure is good). Use d.get("key", default) when the key is optional. Use d.setdefault("key", []) to initialize a key only if missing — useful for accumulating results into a dict of lists.
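As a sketch of the setdefault pattern, the following accumulates chunk IDs per source document. The chunk records and their field names (source, chunk_id) are hypothetical:

```python
retrieved = [
    {"source": "guide.md", "chunk_id": "g-1"},
    {"source": "faq.md", "chunk_id": "f-1"},
    {"source": "guide.md", "chunk_id": "g-2"},
]

chunks_by_source = {}
for chunk in retrieved:
    # The first time a source is seen, setdefault stores and returns a new empty list;
    # on later iterations it returns the list that is already there.
    chunks_by_source.setdefault(chunk["source"], []).append(chunk["chunk_id"])

print(chunks_by_source)
# {'guide.md': ['g-1', 'g-2'], 'faq.md': ['f-1']}
```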
Sets for Deduplication
Sets are underused but invaluable in AI pipelines — deduplicating retrieved document IDs, tracking which chunks have been processed, and enforcing uniqueness in evaluation datasets.
# Deduplicating retrieved document IDs across multiple queries
query_results = [
    ["doc_001", "doc_002", "doc_005"],
    ["doc_002", "doc_003", "doc_005"],
    ["doc_001", "doc_004"],
]
# Flatten and deduplicate in one step
all_doc_ids = {doc_id for results in query_results for doc_id in results}
print(all_doc_ids)  # e.g. {'doc_001', 'doc_002', 'doc_003', 'doc_004', 'doc_005'} (set order is not guaranteed)
# Set operations — useful for evaluation
expected = {"doc_001", "doc_002", "doc_003"}
retrieved = {"doc_001", "doc_003", "doc_005"}
precision_hits = expected & retrieved # intersection: {'doc_001', 'doc_003'}
missed = expected - retrieved # difference: {'doc_002'}
extra = retrieved - expected # false positives: {'doc_005'}
recall = len(precision_hits) / len(expected)
print(f"Recall: {recall:.1%}") # 66.7%
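The same set operations extend to precision. A minimal sketch, assuming a hypothetical helper named precision_recall that returns 0.0 when a denominator would be empty:

```python
def precision_recall(expected: set, retrieved: set) -> tuple:
    """Return (precision, recall) for one query; 0.0 when a denominator is empty."""
    hits = expected & retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(expected) if expected else 0.0
    return precision, recall

p, r = precision_recall(
    {"doc_001", "doc_002", "doc_003"},
    {"doc_001", "doc_003", "doc_005"},
)
print(f"Precision: {p:.1%} | Recall: {r:.1%}")  # Precision: 66.7% | Recall: 66.7%
```

Precision divides the hits by the number of retrieved documents, while recall divides by the number of expected ones. Here both come out to 2/3 only because the two sets happen to be the same size.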
Sorting, Filtering, and Aggregating
# Sort responses by token count (descending)
sorted_responses = sorted(
    [r for r in responses if r["text"]],
    key=lambda r: r["tokens"],
    reverse=True,
)
# Group responses by finish reason using a dict
from collections import defaultdict
grouped: dict[str, list] = defaultdict(list)
for r in responses:
    grouped[r["finish_reason"]].append(r)
print(dict(grouped))
# {'end_turn': [...], 'max_tokens': [...]}
# Aggregate stats
total_tokens = sum(r["tokens"] for r in responses)
avg_tokens = total_tokens / len(responses)
max_tokens_response = max(responses, key=lambda r: r["tokens"])
print(f"Total: {total_tokens} | Avg: {avg_tokens:.1f} | Max: {max_tokens_response['id']}")
The collections module has two must-know types for AI data work: defaultdict(list) for grouping without KeyError checks, and Counter for frequency counting (e.g., counting finish reasons, token distributions, or label occurrences in evaluation sets).
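A short sketch of Counter applied to finish reasons. The list below is illustrative data using the same finish_reason values as the responses above:

```python
from collections import Counter

finish_reasons = ["end_turn", "end_turn", "max_tokens", "end_turn"]  # illustrative

counts = Counter(finish_reasons)
print(counts["end_turn"])       # 3
print(counts["stop_sequence"])  # 0 (missing keys count as zero, no KeyError)
print(counts.most_common(1))    # [('end_turn', 3)]
```

In practice the input would typically be a generator such as (r["finish_reason"] for r in responses), which Counter accepts directly.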
Key Takeaways
- Python's four core collection types map to C# generics: list → List, dict → Dictionary, tuple → ValueTuple, set → HashSet
- List comprehensions replace most LINQ chains. Master them early; they're everywhere in AI code
- Use dict.get("key", default) for safe access to JSON API responses; use d["key"] when the field is required
- Sets are your deduplication tool: use them for document ID tracking, evaluation recall/precision, and deduplicating retrieved chunks
- enumerate() replaces index-based for i in range(len(...)); zip() pairs multiple iterables cleanly
- collections.defaultdict and Counter are the workhorses of grouping and frequency analysis in AI evaluation pipelines