
GDPval-AA

Release: October 2025
Score range: unknown
Models tested: 2
Tags: Agentic Tasks, Knowledge & Language, Expert

GDPval-AA: Overview

GDPval-AA is the Artificial Analysis benchmark built on OpenAI's GDPval dataset. GDPval tests AI models on real, economically relevant tasks drawn from 44 occupations across 9 economic sectors. In the benchmark, the LLMs work in an agentic loop with access to a shell, a web browser, and further tools to produce professional work products such as documents, presentations, spreadsheets, and charts. Outputs are scored through pairwise comparisons against other LLMs, from which an Elo score is derived. GPT-5.1 (Non-Reasoning) serves as the anchor, fixed at an Elo of 1000. GDPval-AA measures how close AI models come to the quality of human work by professionals with an average of 14 years of experience.
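How pairwise comparisons can turn into an anchored Elo scale is easiest to see in a small sketch. The following is not the Artificial Analysis implementation, which is not described here; it assumes the standard logistic Elo model with repeated online updates, and `model_x`, the K-factor, and the epoch count are hypothetical choices made for illustration.

```python
def expected_score(r_a, r_b):
    """Probability that A beats B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def fit_elo(matches, anchor, anchor_rating=1000.0, k=16.0, epochs=200):
    """Fit ratings from (winner, loser) pairs, then pin the anchor.

    Assumption: plain online Elo updates repeated over the match list.
    The real benchmark may use a different fitting procedure.
    """
    ratings = {}
    for winner, loser in matches:
        ratings.setdefault(winner, anchor_rating)
        ratings.setdefault(loser, anchor_rating)

    for _ in range(epochs):
        for winner, loser in matches:
            e_win = expected_score(ratings[winner], ratings[loser])
            delta = k * (1.0 - e_win)  # winner scored 1, expected e_win
            ratings[winner] += delta
            ratings[loser] -= delta

    # Shift all ratings so the anchor model sits exactly at anchor_rating.
    shift = anchor_rating - ratings[anchor]
    return {model: rating + shift for model, rating in ratings.items()}


# Hypothetical data: model_x wins 3 of 4 comparisons against the anchor.
ANCHOR = "GPT-5.1 (Non-Reasoning)"
matches = [("model_x", ANCHOR)] * 3 + [(ANCHOR, "model_x")]
ratings = fit_elo(matches, anchor=ANCHOR)
```

Only rating differences carry meaning in this model, which is why a fixed anchor is needed to make scores comparable over time.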

GDPval-AA Leaderboard

Ranking of all models tested in the GDPval-AA benchmark, sorted by score.



Example tasks from the GDPval-AA benchmark

The following example tasks show the kind of assignments that typically appear in the GDPval-AA benchmark.

You are an auditor and as part of an audit engagement, you are tasked with reviewing and testing the accuracy of reported Anti-Financial Crime Risk Metrics. The attached spreadsheet titled 'Population' contains Anti-Financial Crime Risk Metrics for Q2 and Q3 2024. Using the data in the 'Population' spreadsheet, complete the following: 1. Calculate the required sample size for audit testing based on a 90% confidence level and a 10% tolerable error rate. 2. Perform a variance analysis on Q2 and Q3 data - Calculate quarter-on-quarter variance. 3. Select a sample for audit testing based on criteria including metrics with >20% variance, specific entities, metrics with higher risk weightings, zero values, specific business lines, geographic coverage, and divisional coverage. 4. Create a new spreadsheet titled 'Sample' with selected sample data and workings.

Excel spreadsheet with Sample Size Calculation tab (z=1.645, p=0.5, e=0.10 with finite population correction), variance analysis (Q3-Q2)/Q2 in column J, and Sample spreadsheet with selected rows meeting all specified criteria including geographic coverage (Italy, Greece, Luxembourg, Brazil, UAE, Cayman Islands, Pakistan) and divisional coverage.
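The sample size parameters in this expected output (z=1.645, p=0.5, e=0.10, finite population correction) match the inputs of the standard Cochran formula, so a minimal sketch under that assumption is shown here together with the (Q3-Q2)/Q2 variance. The population size of 1,500 is hypothetical; the actual size of the 'Population' spreadsheet is not given.

```python
import math


def sample_size(population, z=1.645, p=0.5, e=0.10):
    """Cochran sample size with finite population correction (assumed formula)."""
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)   # size for an infinite population
    n = n0 / (1 + (n0 - 1) / population)     # finite population correction
    return math.ceil(n)                      # round up to whole items


def qoq_variance(q2, q3):
    """Quarter-on-quarter variance (Q3 - Q2) / Q2; None when Q2 is zero."""
    if q2 == 0:
        return None  # undefined; zero values are flagged separately in the task
    return (q3 - q2) / q2


n_hypothetical = sample_size(1500)  # 1500 is an illustrative population size
```

Returning `None` for a zero Q2 value is a design choice for this sketch: the task lists zero values as a separate sampling criterion, so they should not silently become division errors.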

You are the Finance Lead for an advisory client and are responsible for managing and controlling expenses related to their professional music engagements. Prepare a structured Excel profit and loss report summarizing the 2024 Fall Music Tour (October 2024). Include breakdown of income and costs by source (Tour Manager vs. Production Company), revenue with line-by-line summary by city and country, foreign tax withholding by country (UK 20%, France 15%, Spain 24%, Germany 15.825%), all revenue in USD, expense categories (Band & Crew, Other Tour Costs, Hotel & Restaurants, Other Travel Costs), and Net Income calculation.

Excel P&L report showing Tour stops (London UK $230,754; Paris 2x $175,880 & $168,432; Barcelona $125,932; Madrid $110,823; Munich $99,117; Berlin $132,812), Total Gross Revenue $1,043,750, Total Withholding $191,322, Total Net Revenue $852,428, Band & Crew $106,160 combined, Total Expenses $732,006, Total Net Income $120,423.
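The revenue totals in this expected output can be re-derived from the per-stop figures and the withholding rates given in the task. The short check below uses only numbers quoted above; the variable names are illustrative.

```python
# Gross revenue per tour stop in USD, as listed in the expected output.
stops = [
    ("London", "UK", 230_754),
    ("Paris", "France", 175_880),
    ("Paris", "France", 168_432),
    ("Barcelona", "Spain", 125_932),
    ("Madrid", "Spain", 110_823),
    ("Munich", "Germany", 99_117),
    ("Berlin", "Germany", 132_812),
]

# Foreign tax withholding rates from the task description.
withholding_rate = {"UK": 0.20, "France": 0.15, "Spain": 0.24, "Germany": 0.15825}

gross = sum(amount for _, _, amount in stops)
withholding = sum(amount * withholding_rate[country] for _, country, amount in stops)
net_revenue = gross - withholding

print(f"Total Gross Revenue: ${gross:,.0f}")
print(f"Total Withholding:   ${withholding:,.0f}")
print(f"Total Net Revenue:   ${net_revenue:,.0f}")
```

Rounded to whole dollars, the three results agree with the totals stated in the expected output.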

You are a Senior Staff Accountant at Aurisic. Prepare a detailed amortization schedule for all of Aurisic's prepaid expenses and insurance through April 2025. Create three Excel tabs: 1) Prepaid Summary with GL balances and YTD amortization, 2) Prepaid Expenses (Account #1250) detailed schedule with monthly activity, 3) Prepaid Insurance (Account #1251) detailed schedule for insurance policies. GL balances must reconcile for Jan-Apr 2025 using straight-line amortization with default 6-month term if not specified.

Excel workbook with three tabs showing Prepaid Expenses (1250) April balance $559,377.61, Prepaid Insurance (1251) April balance $369,976.70, Total Prepaid $929,354.31, with zero variance between calculated and GL balances for all months, vendor organization, and monthly addition/amortization/balance summaries.
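The straight-line rule with a default 6-month term, as required by the task, can be sketched as follows. The amounts are hypothetical and unrelated to the Aurisic ledger; working in integer cents and letting the final month absorb the rounding remainder is an assumption made here so that the schedule always ends at a zero balance.

```python
def straight_line_schedule(amount_cents, term_months=6):
    """Return (month, expense, remaining_balance) rows in integer cents.

    Assumption: equal monthly expense, with the last month taking any
    rounding remainder so the prepaid balance ends at exactly zero.
    """
    monthly = amount_cents // term_months
    balance = amount_cents
    rows = []
    for month in range(1, term_months + 1):
        expense = monthly if month < term_months else balance
        balance -= expense
        rows.append((month, expense, balance))
    return rows


# Hypothetical prepaid item of $1,200.00 with no stated term (defaults to 6).
even_rows = straight_line_schedule(120_000)
# Hypothetical item of $1,000.00, which does not divide evenly by 6.
uneven_rows = straight_line_schedule(100_000)
```

Summing the remaining balances of all open items for a given month gives the figure that would be reconciled against the GL balance for that month.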

You are a mid-level Tax Preparer at an accounting firm. Complete an Individual Tax return (form 1040) for Bob and Lisa Smith using the provided 2024 tax documents. The 1040 should be provided in PDF form, and should include any Schedules or Forms that would be required to be e-filed with the Form 1040 according to current IRS regulations for the 2024 tax year.

PDF Form 1040 (Married Filing Jointly) for Robert Smith Jr. (SSN 333-44-5555) and Lisa M. Smith (SSN 444-55-6666), Line 1a Wages $327,003, Line 2b Taxable Interest $1,116, Line 3b Ordinary Dividends $6,744, Line 15 Taxable Income $329,930, Line 24 Total Tax $58,146, Refund $14,953, with Schedules 1, 2, 3, A, B, D, Forms 8949, 2441, 8812, 8995, 8959, 8960.

Develop a formal report for the Chief of Police regarding the procurement of new duty rifles for departmental issuance. The report should include research on available options, cost analysis, and recommendations based on departmental needs and budget constraints.

Formal procurement report document with research on available duty rifle options, comparative cost analysis, budget impact assessment, and evidence-based recommendations for departmental issuance.