Title
Bug: Inconsistent parsing of financial statements – list items vs paragraphs for similar structures
Description
I am observing inconsistent parsing behavior for financial statement tables across different PDFs.
Example
For FY 2022–23:
- Data is parsed as structured list and list item elements
- Hierarchy (I, II, III, a, b, c) is preserved
For FY 2024–25:
- Similar data is parsed as plain paragraph elements
- No structure or hierarchy detected
Expected Behavior
Similarly structured financial statements should produce consistent structured output (list/table format).
Actual Behavior
- One file → structured list output
- Another similar file → flat paragraph output
Sample Output
2022-2023 Profit and loss
{'type': 'paragraph', 'id': 8175, 'page number': 134, 'bounding box': [56.693, 719.761, 210.733, 730.165], 'font': 'HelveticaLTStd-Roman', 'font size': 9.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'for the year ended on 31st March, 2023'}
{'type': 'paragraph', 'id': 8177, 'page number': 134, 'bounding box': [59.525, 684.966, 545.669, 713.31], 'font': 'HelveticaLTStd-Roman', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': '(` In Lakhs) Particulars Notes'}
{'type': 'heading', 'id': 8178, 'level': 'Subtitle', 'page number': 134, 'bounding box': [395.037, 679.941, 466.299, 699.462], 'heading level': 35, 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[1.0, 1.0, 1.0]', 'content': 'For the year ended 31st March, 2023'}
{'type': 'paragraph', 'id': 8179, 'page number': 134, 'bounding box': [478.475, 689.965, 545.667, 699.213], 'font': 'HelveticaLTStd-Roman', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'For the year ended'}
{'type': 'paragraph', 'id': 8180, 'page number': 134, 'bounding box': [59.525, 666.261, 545.669, 689.213], 'font': 'HelveticaLTStd-Roman', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': '31st March, 2022 Revenue'}
{'type': 'list', 'id': 8181, 'level': '9', 'page number': 134, 'bounding box': [59.469, 258.365, 545.669, 662.205], 'numbering style': 'roman numbers', 'number of list items': 11,
'list items': [{'type': 'list item', 'page number': 134, 'bounding box': [59.525, 652.373, 545.669, 662.205], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'I. Revenue from Operations 27 59,780.35 51,712.50', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [59.525, 638.677, 545.669, 648.509], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'II. Other income 28 661.40 582.53', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [59.525, 625.173, 545.669, 634.693], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'III. Total Income (I+II) 60,441.75 52,295.03', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [59.525, 611.493, 111.053, 621.013], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'IV. Expenses',
'kids': [{'type': 'list', 'id': 2037, 'level': '10', 'page number': 134, 'bounding box': [73.699, 505.621, 545.669, 607.437], 'numbering style': 'english letters', 'number of list items': 6,
'list items': [{'type': 'list item', 'page number': 134, 'bounding box': [73.701, 597.605, 545.669, 607.437], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'a) Cost of materials consumed 29 31,058.53 26,617.63', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [73.701, 573.909, 302.629, 593.741], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'b) Changes in inventories of finished goods, stock in trade and work-in-progress', 'kids': [{'type': 'paragraph', 'id': 2036, 'page number': 134, 'bounding box': [368.309, 583.909, 545.669, 593.741], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': '30 17.03 92.73'}]},
{'type': 'list item', 'page number': 134, 'bounding box': [73.701, 560.213, 545.669, 570.045], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'c) Employee Benefits Expenses 31 4,774.76 3,944.08', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [73.701, 546.517, 545.669, 556.349], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'd) Finance Costs 32 1,499.73 1,800.13', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [73.699, 532.821, 545.669, 542.653], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'e) Depreciation,\tAmortisation\tand\tImpairment\texpense 33 1,163.19 1,180.92', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [73.701, 505.621, 545.669, 528.957], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'f) Other Expenses 34 17,064.66 14,452.45 Total Expenses (IV) 55,577.90 48,087.94', 'kids': []}]}]},
{'type': 'list item', 'page number': 134, 'bounding box': [59.525, 491.749, 545.661, 501.581], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'V. Profit\tBefore\tExceptional\tItems\tand\tTax\t(III-IV) 4,863.85 4,207.09', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [59.517, 478.053, 545.653, 487.885], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'VI. Exceptional\tItems - -', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [59.509, 464.549, 545.653, 474.069], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'VII. Profit Before Tax (V-VI) 4,863.85 4,207.09', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [59.509, 409.573, 545.653, 460.493], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'VIII. Tax expense: 36 Current Tax 1,285.13 1,041.65 Deferred\tTax (36.51) 78.49 Total Tax Expenses 1248.62 -', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [59.501, 396.069, 545.645, 405.589], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'IX. Profit for the period (VII-VIII) 3,615.23 3,086.95', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [59.501, 382.181, 377.181, 392.013], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'X. Other Comprehensive Income 37',
'kids': [{'type': 'list', 'id': 2038, 'level': '10', 'page number': 134, 'bounding box': [73.669, 327.397, 548.301, 378.317], 'numbering style': 'english letters', 'number of list items': 2,
'list items': [{'type': 'list item', 'page number': 134, 'bounding box': [73.677, 354.789, 548.301, 378.317], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'A. (i) Items that will not be reclassified to profit or loss (11.19) 463.57 (ii) Income tax related to items that will not be reclassified to profit or loss 2.95 (108.22)', 'kids': []},
{'type': 'list item', 'page number': 134, 'bounding box': [73.669, 327.397, 545.637, 350.925], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'B. (i) Items that will be reclassified to profit or loss - (ii) Income tax related to items that will be reclassified to profit or loss - -', 'kids': []}]},
{'type': 'paragraph', 'id': 2039, 'page number': 134, 'bounding box': [59.485, 313.893, 545.621, 323.413], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'Total Other Comprehensive Income (X) (8.24) 355.35'}]},
{'type': 'list item', 'page number': 134, 'bounding box': [59.469, 258.365, 545.621, 309.733], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'XI. Total Comprehensive Income for the period (IX+X) 3,606.99 3,442.30 Earnings\tper\tequity\tshare\tof\tFace\tValue\tof C 5 each 38 Basic 10.25 8.75 Diluted 10.25 8.75', 'kids': []}]}
2024 - 2025 Profit and loss
{'type': 'paragraph', 'page number': 170, 'bounding box': [56.693, 719.761, 196.333, 730.165], 'font': 'HelveticaLTStd-Roman', 'font size': 9.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'for the Year Ended 31st March 2025'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [506.114, 700.309, 545.294, 709.527], 'font': 'HelveticaLTStd-Light', 'font size': 7.5, 'text color': '[0.0, 0.0, 0.0]', 'content': '(In Lakhs)'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [59.528, 686.648, 468.152, 696.168], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[1.0, 1.0, 1.0]', 'content': 'Particulars Note No. For the Year ended '}
{'type': 'paragraph', 'page number': 170, 'bounding box': [478.444, 686.672, 547.516, 695.92], 'font': 'HelveticaLTStd-Roman', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'For the Year ended '}
{'type': 'paragraph', 'page number': 170, 'bounding box': [407.328, 676.648, 465.924, 686.168], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[1.0, 1.0, 1.0]', 'content': '31st March 2025'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [59.526, 662.968, 545.294, 685.92], 'font': 'HelveticaLTStd-Roman', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': '31st March 2024 Revenue'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [59.526, 608.184, 545.294, 658.912], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'I. Revenue from Operations 29 79,491.98 67,245.00 II. Other Income 30 917.07 853.71 III. Total Income (I+II) 80,409.05 68,098.71 IV. Expenses'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [73.702, 580.6, 545.294, 604.128], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'a) Cost of Materials Consumed 31 42,410.77 35,684.48 b) Changes in Inventories of Finished Goods, Stock in Trade and Work In '}
{'type': 'paragraph', 'page number': 170, 'bounding box': [367.934, 580.6, 545.294, 590.432], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': '32 (443.49) (72.35)'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [87.878, 570.6, 119.59, 580.432], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'Progress'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [73.702, 515.816, 545.294, 566.736], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'c) Employee Benefits Expenses 33 6,382.87 5,410.07 d) Finance Costs 34 1,572.66 1,292.05 e) Depreciation, Amortisation and Impairment Expense 35 1,506.76 1,158.88 f) Other Expenses 36 21,406.31 17,651.18'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [59.526, 447.336, 545.294, 511.832], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'Total Expenses (IV) 72,835.88 61,124.32 V. Profit Before Exceptional Items and Tax(III-IV) 7,573.17 6,974.41 VI. Exceptional Items 52 203.50 155.56 VII. Profit Before Tax (V-VI) 7,369.67 6,818.84 VIII. Tax Expense: 38'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [73.702, 406.44, 545.294, 443.472], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'Current Tax 1,828.09 1,750.26 Deferred Tax (94.41) 53.30 Total Tax Expense 1,733.68 1,803.56'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [59.526, 378.856, 545.294, 402.264], 'font': 'HelveticaLTStd-Bold', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'IX. Profit for the year(VII-VIII) 5,635.98 5,015.28 X. Other Comprehensive Income 39'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [73.702, 365.16, 545.294, 374.992], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': 'A. (i) Items that will not be reclassified to profit or loss 532.36 804.63'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [73.702, 337.768, 545.294, 361.296], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': '(ii) Income tax related to items that will not be reclassified to profit or loss 58.12 (187.11) B. (i) Items that will be reclassified to profit or loss - -'}
{'type': 'paragraph', 'page number': 170, 'bounding box': [59.526, 240.952, 545.294, 333.904], 'font': 'HelveticaLTStd-Light', 'font size': 8.0, 'text color': '[0.0, 0.0, 0.0]', 'content': '(ii) Income tax related to items that will be reclassified to profit or loss - Total Other Comprehensive Income (X) 590.48 617.52 XI. Total Comprehensive Income for the year(IX+X) 6,226.46 5,632.81 Earnings per equity share of Face Value of5 each 40 Basic 15.97 14.21 Diluted 15.97 14.21 See accompanying notes to the financial statements 1 to 53'}
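To make the difference between the two outputs easy to see at a glance, I tallied element types with a small helper along these lines. This is only a sketch: `count_types` is a name I made up, and it assumes the parser output is a list of dicts that use the 'type', 'kids' and 'list items' keys seen in the sample output. The two inputs here are heavily shortened stand-ins, not the full outputs.

```python
from collections import Counter

def count_types(elements):
    """Recursively tally the 'type' of every element, including nested ones."""
    counts = Counter()
    for element in elements:
        counts[element.get("type", "unknown")] += 1
        # Nested content appears under 'kids' and, for lists, under 'list items'.
        for key in ("kids", "list items"):
            counts.update(count_types(element.get(key, [])))
    return counts

# Shortened stand-in for the FY 2022-23 output (nested list structure).
fy_2022_23 = [
    {"type": "list", "list items": [
        {"type": "list item", "content": "I. Revenue from Operations", "kids": []},
        {"type": "list item", "content": "IV. Expenses", "kids": [
            {"type": "list", "list items": [
                {"type": "list item", "content": "a) Cost of materials consumed", "kids": []},
            ]},
        ]},
    ]},
]

# Shortened stand-in for the FY 2024-25 output (flat paragraphs only).
fy_2024_25 = [
    {"type": "paragraph", "content": "I. Revenue from Operations II. Other Income"},
    {"type": "paragraph", "content": "a) Cost of Materials Consumed"},
]

print(count_types(fy_2022_23))
print(count_types(fy_2024_25))
```

On the real outputs, the first file reports list and list item elements while the second reports only paragraphs.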
Observations
- Borderless tables and alignment-based layouts are not consistently detected
- Financial statements are especially affected
Impact
This inconsistency makes it difficult to build reliable downstream pipelines (e.g., financial data extraction, RAG systems).
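As a stopgap on the downstream side, the flat paragraph content can be split back into rows with a regular expression. The sketch below is my own workaround, not part of the parser: `split_roman_items` and `ROMAN_MARKER` are hypothetical names, and it assumes top-level rows always start with a Roman numeral followed by a period and a space, as in the FY 2024-25 sample.

```python
import re

# Split points: whitespace immediately followed by a Roman numeral, a period and a space.
ROMAN_MARKER = re.compile(r"\s+(?=[IVX]+\.\s)")

def split_roman_items(content):
    """Split a flattened paragraph into one string per Roman-numbered row."""
    return [part.strip() for part in ROMAN_MARKER.split(content) if part.strip()]

# Content of one flat paragraph from the FY 2024-25 output.
flat = (
    "I. Revenue from Operations 29 79,491.98 67,245.00 "
    "II. Other Income 30 917.07 853.71 "
    "III. Total Income (I+II) 80,409.05 68,098.71 "
    "IV. Expenses"
)

for row in split_roman_items(flat):
    print(row)
```

This is fragile: rows without a numeral (for example "Total Expenses (IV)") stay attached to the preceding row, and the hierarchy under a), b), c) is still lost, which is why consistent structured output from the parser matters.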
Additional Context
Both PDFs are visually similar but produce very different structured outputs.
