Python Dictionaries and JSON: Data Handling Mastery

June 16, 2025 · 13 min read
Tech, Python, API

What You'll Learn

  • Creating and manipulating Python dictionaries
  • Combining dictionaries with lists for complex data structures
  • Making HTTP requests and handling API responses
  • Reading and processing JSON data effectively
  • Best practices for working with external APIs

Python dictionaries and JSON handling are essential skills for modern development. Whether you're building web applications, processing API data, or managing configuration files, mastering these concepts will significantly improve your ability to work with structured data.

In my experience building AI systems and integrating with various APIs at Bell Canada, I've found that understanding dictionaries and JSON processing is crucial for handling real-world data efficiently. Let's explore these powerful Python features.

1. Python Dictionaries Fundamentals

Dictionaries are Python's implementation of key-value data structures. They provide fast (average constant-time) lookups by key and are well suited to storing related data that needs to be accessed by meaningful identifiers rather than numeric indices.

Creating and Defining Dictionaries

# Empty dictionary
acronyms = {}

# Dictionary with initial data
acronyms = {
    "LOL": "Laugh out loud",
    "IDK": "I don't know", 
    "IMY": "I miss you",
    "BRB": "Be right back"
}

# Alternative creation method
user_data = dict(name="John", age=30, city="Toronto")

💡 Key Concept: Dictionary keys must be hashable, which in practice means immutable built-ins such as strings, numbers, and tuples whose elements are themselves hashable. Values can be any Python object, including lists, other dictionaries, or custom objects.
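
To make the key rule concrete, here is a small sketch. The names grid and is_valid_key are illustrative and not part of the examples above. It relies on the fact that a key must be hashable: a tuple of numbers works, while a list (or a tuple containing a list) does not.

```python
# Tuples of hashable items are valid keys; lists are not.
grid = {(0, 0): "origin", (1, 2): "point A"}
grid[(3, 4)] = "point B"


def is_valid_key(candidate):
    """Return True if candidate can be used as a dictionary key."""
    try:
        hash(candidate)
        return True
    except TypeError:
        return False


print(is_valid_key((1, 2)))    # True
print(is_valid_key([1, 2]))    # False
print(is_valid_key((1, [2])))  # False: a tuple holding a list is unhashable
```

Trying grid[[1, 2]] = "x" directly would raise TypeError for the same reason.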

Accessing and Modifying Dictionary Data

# Accessing values by key
print(acronyms["LOL"])  # Output: "Laugh out loud"

# Adding new items
acronyms["TTYL"] = "Talk to you later"
acronyms["OMG"] = "Oh my god"

# Updating existing items
acronyms["LOL"] = "Lots of love"  # Updates the value

# Removing items
del acronyms["IDK"]  # Removes the key-value pair

print(acronyms)
# Output: {'LOL': 'Lots of love', 'IMY': 'I miss you', 'BRB': 'Be right back', 'TTYL': 'Talk to you later', 'OMG': 'Oh my god'}

Safe Dictionary Access with get()

The get() method provides a safe way to access dictionary values without raising KeyError exceptions:

# Safe access with get()
definition = acronyms.get("BTW")  # Returns None if key doesn't exist
print(definition)  # Output: None

# Providing default values
definition = acronyms.get("BTW", "Not found")
print(definition)  # Output: "Not found"

# Unsafe access (can raise KeyError)
try:
    definition = acronyms["BTW"]  # This will raise KeyError
except KeyError:
    print("Key 'BTW' not found in dictionary")

# Checking if key exists
if "LOL" in acronyms:
    print(f"LOL means: {acronyms['LOL']}")

2. Combining Lists and Dictionaries

Real-world applications often require complex data structures. Combining lists and dictionaries allows you to model sophisticated relationships and hierarchical data.

Dictionaries with List Values

# Restaurant menu example
menus = {
    'Breakfast': ['Egg Sandwich', 'Bagel', 'Coffee', 'Pancakes'],
    'Lunch': ['BLT', 'PB&J', 'Turkey Sandwich', 'Caesar Salad'],
    'Dinner': ['Steak', 'Salmon', 'Pasta', 'Vegetarian Bowl']
}

# Accessing nested data
breakfast_items = menus['Breakfast']
first_breakfast_item = menus['Breakfast'][0]  # "Egg Sandwich"

# Adding items to existing lists
menus['Breakfast'].append('French Toast')
menus['Lunch'].extend(['Club Sandwich', 'Soup'])

print(f"Breakfast options: {len(menus['Breakfast'])}")  # Output: 5

Iterating Through Complex Structures

# Iterating through keys only (default behavior)
for meal_type in menus:
    print(f"We serve {meal_type}")

# Iterating through keys and values
for meal_type, items in menus.items():
    print(f"\n{meal_type} Menu:")
    for i, item in enumerate(items, 1):
        print(f"  {i}. {item}")

# Iterating through values only
for menu_items in menus.values():
    print(f"This menu has {len(menu_items)} items")

Lists of Dictionaries

# Employee database example
employees = [
    {"name": "Alice", "department": "Engineering", "salary": 95000, "years": 3},
    {"name": "Bob", "department": "Marketing", "salary": 75000, "years": 2},
    {"name": "Charlie", "department": "Engineering", "salary": 105000, "years": 5},
    {"name": "Diana", "department": "Sales", "salary": 85000, "years": 4}
]

# Processing list of dictionaries
total_salary = sum(emp["salary"] for emp in employees)
avg_salary = total_salary / len(employees)

# Filtering data
engineers = [emp for emp in employees if emp["department"] == "Engineering"]
senior_employees = [emp for emp in employees if emp["years"] >= 4]

print(f"Average salary: {avg_salary:.2f}")
print(f"Number of engineers: {len(engineers)}")
print(f"Senior employees: {[emp['name'] for emp in senior_employees]}")
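
A common next step with a list of dictionaries is grouping by one field. The sketch below is one way to do it with collections.defaultdict from the standard library; the variable names salaries_by_department and average_by_department are my own, and the employee records are a trimmed copy of the ones above.

```python
from collections import defaultdict

employees = [
    {"name": "Alice", "department": "Engineering", "salary": 95000},
    {"name": "Bob", "department": "Marketing", "salary": 75000},
    {"name": "Charlie", "department": "Engineering", "salary": 105000},
    {"name": "Diana", "department": "Sales", "salary": 85000},
]

# Collect salaries per department; missing keys start as an empty list.
salaries_by_department = defaultdict(list)
for emp in employees:
    salaries_by_department[emp["department"]].append(emp["salary"])

# Reduce each group to its average.
average_by_department = {
    dept: sum(salaries) / len(salaries)
    for dept, salaries in salaries_by_department.items()
}

print(average_by_department)
# {'Engineering': 100000.0, 'Marketing': 75000.0, 'Sales': 85000.0}
```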

🎯 Real-World Application: At Bell Canada, we use similar nested structures to manage customer data, service configurations, and API responses from multiple systems.

3. Understanding JSON and APIs

JSON (JavaScript Object Notation) is the most common format for data exchange between web services. Understanding how to work with JSON is essential for modern application development.

What is JSON?

JSON Characteristics:

  • Lightweight, text-based data interchange format
  • Language-independent but uses conventions familiar to programmers
  • Built on key-value pairs (like Python dictionaries), with keys that are always strings
  • Supports strings, numbers, booleans, null, arrays, and nested objects
  • Widely used for API responses and configuration files
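
To see how these JSON types map onto Python, the following sketch parses a small hand-written document with the standard json module and prints the Python type of each value. The sample record is made up for illustration.

```python
import json

# A hand-written JSON document covering each JSON value type.
raw = (
    '{"name": "Ada", "age": 36, "score": 9.5, "active": true, '
    '"tags": ["a", "b"], "manager": null, "address": {"city": "Toronto"}}'
)

record = json.loads(raw)

for key, value in record.items():
    print(f"{key}: {type(value).__name__}")
# name: str
# age: int
# score: float
# active: bool
# tags: list
# manager: NoneType
# address: dict
```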

Making HTTP Requests with Python

The third-party requests library is the de facto standard tool for making HTTP requests in Python:

# First, install the requests library
# pip install requests

import requests

# Making a GET request to a public API (the timeout avoids hanging indefinitely)
response = requests.get('http://api.open-notify.org/astros.json', timeout=10)

# Check if request was successful
if response.status_code == 200:
    # Parse JSON response
    data = response.json()
    
    print("People currently in space:")
    print(f"Total: {data['number']}")
    
    for person in data['people']:
        print(f"- {person['name']} on {person['craft']}")
else:
    print(f"Error: {response.status_code}")

Working with API Keys

# Example with weather API (hypothetical)
import requests
import os

# Store API keys securely (use environment variables)
API_KEY = os.getenv('WEATHER_API_KEY', 'your-api-key-here')
BASE_URL = 'https://api.weather.com/v1/current'

def get_weather(city):
    """Fetch weather data for a specific city"""
    params = {
        'key': API_KEY,
        'q': city,
        'format': 'json'
    }
    
    try:
        response = requests.get(BASE_URL, params=params, timeout=10)
        response.raise_for_status()  # Raises exception for bad status codes
        
        weather_data = response.json()
        return {
            'city': weather_data['location']['name'],
            'temperature': weather_data['current']['temp_c'],
            'condition': weather_data['current']['condition']['text'],
            'humidity': weather_data['current']['humidity']
        }
    except requests.exceptions.RequestException as e:
        print(f"Error fetching weather data: {e}")
        return None

# Usage
toronto_weather = get_weather('Toronto')
if toronto_weather:
    print(f"Weather in {toronto_weather['city']}: {toronto_weather['temperature']}°C")
    print(f"Conditions: {toronto_weather['condition']}")
    print(f"Humidity: {toronto_weather['humidity']}%")
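
A placeholder default for the key keeps the example runnable, but in a real service you may prefer to fail early when the variable is missing rather than send a bogus key. The helper below is one possible sketch; require_env is a hypothetical name, and MISSING_EXAMPLE_KEY is a made-up variable used only for the demo.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demo only: set a fake key so the example runs end to end.
os.environ["WEATHER_API_KEY"] = "demo-key"
print(require_env("WEATHER_API_KEY"))  # demo-key

# Make sure the made-up variable really is absent, then show the failure path.
os.environ.pop("MISSING_EXAMPLE_KEY", None)
try:
    require_env("MISSING_EXAMPLE_KEY")
except RuntimeError as exc:
    print(exc)  # Environment variable MISSING_EXAMPLE_KEY is not set
```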

4. Advanced JSON Processing

Beyond basic API calls, you'll often need to process complex JSON structures, handle errors gracefully, and work with local JSON files.

Reading and Writing JSON Files

import json

# Writing data to JSON file
user_preferences = {
    "theme": "dark",
    "language": "en",
    "notifications": {
        "email": True,
        "push": False,
        "sms": True
    },
    "recent_searches": ["python", "json", "api"]
}

# Save to file
with open('user_preferences.json', 'w') as file:
    json.dump(user_preferences, file, indent=2)

# Reading from JSON file
with open('user_preferences.json', 'r') as file:
    loaded_preferences = json.load(file)
    
print(f"User theme: {loaded_preferences['theme']}")
print(f"Email notifications: {loaded_preferences['notifications']['email']}")

# Working with JSON strings
json_string = json.dumps(user_preferences, indent=2)
parsed_data = json.loads(json_string)
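
A round trip through JSON does not always give back exactly what went in. The sketch below shows two behaviours of the standard json module that are worth knowing (tuples become lists, non-string keys become strings) and one way to serialize a datetime through the default parameter. The encode_extra helper is a hypothetical name.

```python
import json
from datetime import datetime

# Tuples come back as lists, and integer keys come back as strings.
original = {"point": (1, 2), 1: "int key"}
round_tripped = json.loads(json.dumps(original))
print(round_tripped)  # {'point': [1, 2], '1': 'int key'}


def encode_extra(obj):
    """Fallback encoder for types the json module does not handle itself."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


event = {"name": "login", "at": datetime(2025, 6, 16, 9, 30)}

try:
    json.dumps(event)  # datetime is not JSON serializable by default
except TypeError as exc:
    print(f"Default encoder failed: {exc}")

encoded = json.dumps(event, default=encode_extra)
print(encoded)  # {"name": "login", "at": "2025-06-16T09:30:00"}
```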

Error Handling and Data Validation

import requests
import json
from typing import Dict, Optional

def fetch_user_data(user_id: int) -> Optional[Dict]:
    """Fetch user data with comprehensive error handling"""
    url = f"https://api.example.com/users/{user_id}"
    
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        
        data = response.json()
        
        # Validate required fields
        required_fields = ['id', 'name', 'email']
        for field in required_fields:
            if field not in data:
                print(f"Warning: Missing required field '{field}'")
                return None
        
        # Process and clean data
        return {
            'id': data['id'],
            'name': data['name'].strip(),
            'email': data['email'].lower(),
            'created_at': data.get('created_at'),
            'is_active': data.get('is_active', True)
        }
        
    except requests.exceptions.Timeout:
        print("Request timed out")
    except requests.exceptions.ConnectionError:
        print("Connection error")
    except requests.exceptions.HTTPError as e:
        print(f"HTTP error: {e}")
    except ValueError:  # JSON decode errors are ValueError subclasses
        print("Invalid JSON response")
    except Exception as e:
        print(f"Unexpected error: {e}")
    
    return None

# Usage with error handling
user = fetch_user_data(123)
if user:
    print(f"User: {user['name']} ({user['email']})")
else:
    print("Failed to fetch user data")
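
The function above checks that required fields are present. If you also want to check their types, one lightweight approach is to compare each value against an expected type, as in this sketch. USER_SCHEMA and validate_record are hypothetical names, and the schema is an assumption about what a user record might contain.

```python
from typing import Any, Dict, List

# Hypothetical schema: field name -> expected Python type.
USER_SCHEMA = {"id": int, "name": str, "email": str}


def validate_record(data: Dict[str, Any], schema: Dict[str, type]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, expected in schema.items():
        if field not in data:
            problems.append(f"missing field '{field}'")
        elif not isinstance(data[field], expected):
            actual = type(data[field]).__name__
            problems.append(
                f"field '{field}' should be {expected.__name__}, got {actual}"
            )
    return problems


good = {"id": 1, "name": "Alice", "email": "alice@example.com"}
bad = {"id": "1", "name": "Alice"}

print(validate_record(good, USER_SCHEMA))  # []
print(validate_record(bad, USER_SCHEMA))
# ["field 'id' should be int, got str", "missing field 'email'"]
```

Returning the full list of problems, rather than stopping at the first one, tends to make API issues easier to diagnose. For larger schemas a dedicated validation library may be a better fit.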

5. Real-World Applications

Let's look at practical examples that combine dictionaries and JSON processing:

Configuration Management

# config.py - Application configuration management
import json
import os
from typing import Dict, Any

class ConfigManager:
    def __init__(self, config_file: str = 'config.json'):
        self.config_file = config_file
        self.config = self.load_config()
    
    def load_config(self) -> Dict[str, Any]:
        """Load configuration from file with defaults"""
        default_config = {
            'database': {
                'host': 'localhost',
                'port': 5432,
                'name': 'myapp'
            },
            'api': {
                'rate_limit': 100,
                'timeout': 30
            },
            'features': {
                'debug_mode': False,
                'cache_enabled': True
            }
        }
        
        if os.path.exists(self.config_file):
            try:
                with open(self.config_file, 'r') as file:
                    file_config = json.load(file)
                    # Merge with defaults
                    return self.merge_configs(default_config, file_config)
            except Exception as e:
                print(f"Error loading config: {e}")
        
        return default_config
    
    def merge_configs(self, default: Dict, override: Dict) -> Dict:
        """Recursively merge configuration dictionaries"""
        result = default.copy()
        for key, value in override.items():
            if key in result and isinstance(result[key], dict) and isinstance(value, dict):
                result[key] = self.merge_configs(result[key], value)
            else:
                result[key] = value
        return result
    
    def get(self, key_path: str, default=None):
        """Get configuration value using dot notation"""
        keys = key_path.split('.')
        value = self.config
        
        for key in keys:
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                return default
        
        return value

# Usage
config = ConfigManager()
db_host = config.get('database.host')  # 'localhost'
debug_mode = config.get('features.debug_mode')  # False
api_timeout = config.get('api.timeout', 60)  # 30 or 60 if not found
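
To illustrate what the recursive merge produces, here is a standalone sketch of the same idea. The deep_merge name is hypothetical, and unlike merge_configs above it starts from copy.deepcopy so the defaults are never shared with the result.

```python
import copy


def deep_merge(default: dict, override: dict) -> dict:
    """Return a new dict where override values win, merging nested dicts."""
    result = copy.deepcopy(default)
    for key, value in override.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = value
    return result


defaults = {
    "database": {"host": "localhost", "port": 5432},
    "features": {"debug_mode": False},
}
file_config = {
    "database": {"host": "db.internal"},
    "features": {"debug_mode": True},
}

merged = deep_merge(defaults, file_config)
print(merged)
# {'database': {'host': 'db.internal', 'port': 5432}, 'features': {'debug_mode': True}}
```

Note that the port survives even though the file only overrides the host, which is the point of merging recursively instead of calling dict.update() at the top level.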

Data Processing Pipeline

# data_processor.py - Processing API data
import json
import requests
from typing import List, Dict
from datetime import datetime

class DataProcessor:
    def __init__(self, api_base_url: str):
        self.api_base_url = api_base_url
        self.processed_data = []
    
    def fetch_and_process_data(self, endpoints: List[str]) -> Dict:
        """Fetch data from multiple endpoints and process"""
        results = {
            'timestamp': datetime.now().isoformat(),
            'sources': {},
            'summary': {}
        }
        
        for endpoint in endpoints:
            try:
                url = f"{self.api_base_url}/{endpoint}"
                response = requests.get(url, timeout=10)
                response.raise_for_status()
                
                data = response.json()
                processed = self.process_endpoint_data(endpoint, data)
                results['sources'][endpoint] = processed
                
            except Exception as e:
                results['sources'][endpoint] = {'error': str(e)}
        
        results['summary'] = self.generate_summary(results['sources'])
        return results
    
    def process_endpoint_data(self, endpoint: str, data: Dict) -> Dict:
        """Process data based on endpoint type"""
        if endpoint == 'users':
            return {
                'total_users': len(data.get('users', [])),
                'active_users': len([u for u in data.get('users', []) if u.get('active')]),
                'user_roles': self.count_by_field(data.get('users', []), 'role')
            }
        elif endpoint == 'orders':
            orders = data.get('orders', [])
            return {
                'total_orders': len(orders),
                'total_revenue': sum(order.get('amount', 0) for order in orders),
                'order_statuses': self.count_by_field(orders, 'status')
            }
        else:
            return {'raw_data': data}
    
    def count_by_field(self, items: List[Dict], field: str) -> Dict:
        """Count occurrences of field values"""
        counts = {}
        for item in items:
            value = item.get(field, 'unknown')
            counts[value] = counts.get(value, 0) + 1
        return counts
    
    def generate_summary(self, sources: Dict) -> Dict:
        """Generate overall summary from all sources"""
        summary = {'total_records': 0, 'successful_sources': 0, 'failed_sources': 0}
        
        for source, data in sources.items():
            if 'error' in data:
                summary['failed_sources'] += 1
            else:
                summary['successful_sources'] += 1
                # Add source-specific totals
                if 'total_users' in data:
                    summary['total_records'] += data['total_users']
                elif 'total_orders' in data:
                    summary['total_records'] += data['total_orders']
        
        return summary

# Usage
processor = DataProcessor('https://api.mycompany.com')
results = processor.fetch_and_process_data(['users', 'orders', 'products'])

print(f"Processed data from {results['summary']['successful_sources']} sources")
print(f"Total records: {results['summary']['total_records']}")

# Save results
with open(f"data_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json", 'w') as f:
    json.dump(results, f, indent=2)

⚡ Performance Tip: When working with large JSON files, consider using streaming JSON parsers like ijson for memory-efficient processing of massive datasets.
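
ijson is a third-party package, so it is not shown here. As a different, standard-library-only technique with a similar goal, data stored as JSON Lines (one JSON object per line) can be processed one record at a time with a generator. This sketch assumes that file layout; iter_json_lines is a hypothetical helper, and io.StringIO stands in for a real file.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_lines(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line, without loading the whole file."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# In-memory stand-in for a large file opened with open(path, 'r').
sample = io.StringIO(
    '{"id": 1, "amount": 10.0}\n'
    '{"id": 2, "amount": 5.5}\n'
    '\n'
    '{"id": 3, "amount": 4.5}\n'
)

total = sum(record["amount"] for record in iter_json_lines(sample))
print(total)  # 20.0
```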

Best Practices and Security

Key Guidelines

  • Use environment variables: Store API keys and sensitive data securely
  • Implement proper error handling: Always handle network and parsing errors
  • Validate data: Check for required fields and data types
  • Use type hints: Improve code readability and catch errors early
  • Implement rate limiting: Respect API rate limits and implement backoff strategies
  • Cache responses: Avoid unnecessary API calls for frequently accessed data
  • Log appropriately: Log errors and important events without exposing sensitive data
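
As one illustration of the backoff guideline, the sketch below retries a failing operation with exponentially growing delays. retry_with_backoff and its parameters are hypothetical names, and the flaky function simulates a temporary API failure. The sleep function is injectable so the demo runs instantly.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds, doubling the wait after each failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated API call that fails twice, then succeeds.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


delays = []  # record the requested waits instead of actually sleeping
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)  # ok [0.5, 1.0]
```

In production code you would typically catch only retryable errors (timeouts, HTTP 429 or 5xx responses) and add some random jitter to the delay.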

Next Steps

With dictionaries and JSON mastery under your belt, explore advanced topics like database integration, asynchronous programming, and building REST APIs to create powerful, data-driven applications.

Working on a project that involves API integration or complex data processing? I'd be happy to help you design efficient, scalable solutions using these Python techniques.
