Methodology
Data inputs (current raw files)
- EOG (Grades 3-8): Uses ga_milestones_eog aggregate CSVs with columns like LONG_SCHOOL_YEAR, INSTN_NUMBER, TEST_CMPNT_TYP_NM, NUM_TESTED_CNT, PROFICIENT_PCT, and DISTINGUISHED_PCT. We filter to All Students only and ignore grade-level split files (*_by_grade*) and subgroup rows.
- CCRPI (K-8): Uses the School Grades dashboard export in data/raw/2024SchoolGrades_data and keeps LEVEL = SCHOOL rows. Component fields are pulled from the cluster-specific columns (E/M/H) when present; non-numeric values (e.g., 100.00+) are treated as missing.
- HS outcomes: Uses ga_hs_metrics CSVs and keeps school-level rows only: Graduation Rate (Grad Rate -ALL Students), AP (ALL Subjects), and ACT (Highest, All Students, Composite).
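As an illustration of the CCRPI cleaning step, the sketch below keeps school-level rows and turns non-numeric component values into missing values. It is a minimal example on toy data, not the pipeline's actual code: the column name score_E is hypothetical, and using pandas.to_numeric with errors="coerce" is an assumption about how "treated as missing" could be implemented.

```python
import pandas as pd

# Toy stand-in for the School Grades export. LEVEL is named in the text;
# "score_E" is a hypothetical component column used only for illustration.
raw = pd.DataFrame({
    "LEVEL": ["SCHOOL", "SCHOOL", "DISTRICT", "SCHOOL"],
    "score_E": ["85.30", "100.00+", "77.10", "91.00"],
})

# Keep school-level rows only.
schools = raw[raw["LEVEL"] == "SCHOOL"].copy()

# Values that do not parse as numbers (e.g. "100.00+") become NaN.
schools["score_E"] = pd.to_numeric(schools["score_E"], errors="coerce")

print(schools)
```

With this approach a value such as 100.00+ is dropped rather than capped at 100, which matches the "treated as missing" wording above.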
Percentiles
Percentiles are computed statewide per year on available values. For each metric column,
values are ranked within the same year using rank(pct=True) * 100. Ties share the
average rank; missing values are not ranked.
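The ranking rule can be sketched with pandas as follows. This is a minimal example on made-up data; the metric column name proficiency is hypothetical, and grouping by a year column is an assumption about how "within the same year" is applied.

```python
import pandas as pd

# Made-up values; "proficiency" is a hypothetical metric column.
metrics = pd.DataFrame({
    "year": [2023, 2023, 2023, 2023, 2024, 2024],
    "school_id": ["A", "B", "C", "D", "A", "B"],
    "proficiency": [40.0, 60.0, 60.0, None, 55.0, 70.0],
})

# Rank within each year. The default method="average" gives tied values
# the mean of their ranks, and the default na_option="keep" leaves
# missing values unranked (NaN).
metrics["proficiency_pctile"] = (
    metrics.groupby("year")["proficiency"].rank(pct=True) * 100
)

print(metrics)
```

In the 2023 group, schools B and C tie and share the same percentile, while school D has no value and therefore no percentile. Note that with pct=True the lowest-ranked school in a year gets a percentile above zero, because ranks start at 1.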
K-8 composites
K-8 proficiency is built from subject-level EOG results:
- Subject proficiency is PL3% + PL4% when percent columns are present, or (PL3 + PL4) / tested * 100 when only counts exist.
- Overall proficiency is the weighted average of subject proficiency rates, weighted by the number tested in each subject for the same school-year.
- CCRPI components are joined by school_id + year; when components are missing in the source, they remain null in the metrics output.
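The three steps above can be sketched as follows, using the percent-column path only. The data is invented, the column names subject, tested, pl3_pct, pl4_pct, and content_mastery are hypothetical, and the left join is an assumption chosen because it leaves components null when the source has no matching row.

```python
import pandas as pd

# Invented subject-level rows; column names other than school_id and
# year are hypothetical.
eog = pd.DataFrame({
    "school_id": ["A", "A", "B", "B"],
    "year": [2024, 2024, 2024, 2024],
    "subject": ["ELA", "Math", "ELA", "Math"],
    "tested": [100, 300, 50, 50],
    "pl3_pct": [40.0, 25.0, 40.0, 15.0],
    "pl4_pct": [10.0, 5.0, 20.0, 5.0],
})

# Step 1: subject proficiency = PL3% + PL4%.
eog["subject_prof"] = eog["pl3_pct"] + eog["pl4_pct"]

# Step 2: overall proficiency = sum(rate * tested) / sum(tested)
# per school-year.
eog["weighted"] = eog["subject_prof"] * eog["tested"]
sums = eog.groupby(["school_id", "year"])[["weighted", "tested"]].sum()
overall = (sums["weighted"] / sums["tested"]).rename("overall_prof").reset_index()

# Step 3: join CCRPI components on school_id + year. A left join keeps
# every school-year and leaves unmatched components as NaN.
ccrpi = pd.DataFrame({
    "school_id": ["A"],
    "year": [2024],
    "content_mastery": [71.2],
})
out = overall.merge(ccrpi, on=["school_id", "year"], how="left")

print(out)
```

For school A the weighting matters: the simple mean of 50 and 30 is 40, but because three times as many students were tested in Math the weighted result is 35.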
High school metrics
High school metrics are derived from ga_hs_metrics CSVs:
- Graduation rate uses school-level rows for Grad Rate -ALL Students (percent value).
- AP uses school-level ALL Subjects rows and computes ap_3plus_rate as NUMBER_TESTS_3_OR_HIGHER / NUMBER_TESTS_TAKEN * 100.
- ACT uses Highest composite scores for All Students.
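A minimal sketch of the ap_3plus_rate calculation on invented rows is shown below. The handling of schools with zero tests taken is an assumption (the text does not say what the pipeline does); here they are left as missing rather than reported as 0.

```python
import pandas as pd

# Invented school-level AP rows for the ALL Subjects category.
ap = pd.DataFrame({
    "school_id": ["A", "B", "C"],
    "NUMBER_TESTS_TAKEN": [200, 0, 80],
    "NUMBER_TESTS_3_OR_HIGHER": [150, 0, 20],
})

# Assumption: a school with no tests taken gets a missing rate, so the
# denominator is masked to NaN wherever it is not positive.
taken = ap["NUMBER_TESTS_TAKEN"].where(ap["NUMBER_TESTS_TAKEN"] > 0)

ap["ap_3plus_rate"] = ap["NUMBER_TESTS_3_OR_HIGHER"] / taken * 100

print(ap)
```

Leaving the zero-test case as missing keeps it out of the percentile ranking described earlier, where missing values are not ranked.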
Missing years
Some years (notably 2020 for EOG) are absent due to statewide assessment cancellations or gaps in source reporting. The pipeline does not impute or interpolate missing years; those years are simply absent from the metric tables and charts.