# Election Data Indices Documentation

This directory contains comprehensive indices for all processed election data.

## Index Files

### 1. index_results.json

**Purpose**: Complete index of all election results by year, position, and state.

**Structure**:
```json
{
  "2002": {
    "president": {
      "SP": {
        "totalCities": 645,
        "cities": ["SP_SAO_PAULO", "SP_CAMPINAS", ...],
        "path": "results/2002/president/SP/"
      }
    }
  }
}
```

**Use Cases**:
- Find all cities with results for a specific year/position/state
- Get file paths for accessing specific result data
- Count total cities per state/position

### 2. index_candidates_{position}.json

**Purpose**: Index of all candidates by position and year.

**Files**: One file per position (e.g., index_candidates_president.json)

**Structure**:
```json
{
  "position": "president",
  "totalYears": 1,
  "totalCandidates": 6,
  "years": {
    "2002": [
      {
        "filename": "PT_LULA.json",
        "path": "candidates/2002/president/",
        "state": "BR"
      }
    ]
  }
}
```

**Use Cases**:
- Find all candidates for a specific position
- Get file paths for candidate data
- Track candidates across years

### 3. index_years.jsonl

**Purpose**: Summary of all years covered by the dataset.

**Format**: JSON Lines (one JSON object per line)

**Structure**:
```json
{"year": 2002, "electionType": "general", "positions": ["president", "governor"], "states": ["SP", "RJ"], "totalCities": 5565}
```

**Use Cases**:
- Get overview of available years
- Understand election types per year
- Count total coverage

### 4. index_cities.jsonl

**Purpose**: Complete list of all cities covered across all years and positions.

**Format**: JSON Lines (one JSON object per line)

**Structure**:
```json
{"year": 2002, "state": "SP", "city": "SAO_PAULO", "position": "president", "electionType": "general"}
```

**Use Cases**:
- Find all cities in a specific state/year
- Track city coverage across positions
- Geographic analysis

### 5. index_positions.jsonl

**Purpose**: Complete list of all positions covered across all years.

**Format**: JSON Lines (one JSON object per line)

**Structure**:
```json
{"year": 2002, "position": "president", "dataType": "results", "electionType": "general", "description": "President of Brazil"}
```

**Use Cases**:
- Understand available positions per year
- Track position coverage across years
- Election type analysis

## Usage Examples

### Find all presidential results for São Paulo in 2002:
```python
import json

with open('indices/index_results.json') as f:
    results = json.load(f)

# Keys are nested as year -> position -> state; the year key is a string.
sp_president_2002 = results['2002']['president']['SP']
print(f"Cities: {sp_president_2002['totalCities']}")
```

### Get all candidates for a position:
```python
import json

with open('indices/index_candidates_president.json') as f:
    candidates = json.load(f)

# 'years' maps each year to the list of candidate entries for that year.
for year, year_candidates in candidates['years'].items():
    print(f"Year {year}: {len(year_candidates)} candidates")
```

### Process all years:
```python
import json

# JSON Lines: parse each line as its own JSON object.
with open('indices/index_years.jsonl') as f:
    for line in f:
        year_data = json.loads(line)
        print(f"Year {year_data['year']}: {year_data['electionType']}")
```

## File Organization

All processed data follows this structure:
```
data/processed/
├── candidates/{year}/{position}/{state}/
├── results/{year}/{position}/{state}/
└── aggregates/{year}/{position}/
```

Indices provide efficient access to this data without scanning directories.
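
The "find all cities in a specific state/year" use case listed for index_cities.jsonl can be sketched along the following lines. This is a minimal illustration, not part of the dataset tooling: the helper names `iter_jsonl` and `cities_in` are hypothetical, and the code assumes each record carries the `year` (as an integer), `state`, and `city` fields shown in the structure example for that file.

```python
import json
from typing import Iterable, Iterator, List


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-blank line of a JSON Lines source."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            yield json.loads(line)


def cities_in(lines: Iterable[str], state: str, year: int) -> List[str]:
    """Return sorted, de-duplicated city names for one state and year.

    Assumes records shaped like the index_cities.jsonl example:
    'year' is an integer, 'state' and 'city' are strings.
    """
    return sorted({
        record["city"]
        for record in iter_jsonl(lines)
        if record["state"] == state and record["year"] == year
    })
```

Because the helpers accept any iterable of lines, an open file handle can be passed directly, for example `cities_in(f, "SP", 2002)` inside a `with open('indices/index_cities.jsonl') as f:` block. A city appears once in the result even when the index lists it for several positions.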