Academy Developer Accelerator › Module 02

Reviewing What You Didn't Write

AI-generated code has specific, consistent failure patterns — not random errors, but predictable categories of things it gets wrong or underweights. Knowing what these are makes your review faster and more reliable. This module builds the review discipline that makes AI-assisted development actually safe to ship.

⏱ 35–40 min · 3 knowledge checks · Insurance system context throughout
1

AI's consistent failure patterns — what to look for first

Reviewing AI-generated code isn't the same as reviewing code written by a colleague. A colleague makes mistakes based on misunderstandings, time pressure, or gaps in their specific knowledge. AI makes mistakes based on its training — which means it has predictable, consistent failure patterns that you can learn to look for systematically.

Knowing these patterns doesn't mean you trust AI less — it means you review faster and more effectively, because you know where to concentrate your attention. The areas where AI is reliably strong (structure, syntax, naming, boilerplate) need less of your time. The areas where AI is consistently weak need more.

High frequency: Invented API methods and properties
AI generates method and property names that look correct but don't exist in the actual API. This is the most common failure in Guidewire development — Gosu and Guidewire API calls that are plausible but fabricated. In some cases the code compiles (where typing is dynamic) and fails at runtime; in other cases it fails to compile at all.
// AI might generate: policyPeriod.getActivePaymentPlans().filterByStatus("CURRENT")
// But filterByStatus() doesn't exist on PolicyPeriodPaymentPlans
// Actual: policyPeriod.ActivePaymentPlans.where(\pp -> pp.Status == "current")
High frequency: Boundary condition errors
AI implements the main case correctly but gets boundary conditions wrong — off-by-one errors, incorrect inclusive/exclusive comparisons, wrong handling of exactly-at-threshold values. In insurance rating code, this directly affects premium calculation accuracy.
// Business rule: surcharge applies to drivers under 25
// AI often generates: if (age <= 25) → wrong: applies the surcharge at exactly 25
// Or: if (age < 25) → correct only if age is whole years at the effective date;
// wrong if a dob-based calculation rounds age to the nearest year
// Always trace: what happens at exactly age 25? Day before? Day of birthday?
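The safest way to pin down the birthday boundary is to compare dates directly rather than a derived age. A minimal Java sketch (class and method names are illustrative, not Guidewire API; the rule's date semantics are an assumption for the example):

```java
import java.time.LocalDate;

// Illustrative sketch: decide the under-25 surcharge from the date of birth
// directly, so the exact-birthday boundary is unambiguous.
public class SurchargeRule {
    // Surcharge applies while the driver has NOT yet reached their
    // 25th birthday on the policy effective date.
    public static boolean under25Surcharge(LocalDate dateOfBirth, LocalDate effectiveDate) {
        return effectiveDate.isBefore(dateOfBirth.plusYears(25));
    }
}
```

Tracing the boundary: the day before the 25th birthday the surcharge applies; on the birthday itself it does not, and no rounding is involved.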
High frequency: Null and empty collection handling
AI frequently generates code that works for the happy path but throws NullPointerException or produces incorrect results when optional fields are null or collections are empty. Insurance policy data is full of optional fields — AI assumes they're populated.
// AI might generate (assuming drivers list is never empty):
// var primaryDriver = policy.Drivers.first()
// Will throw exception if Drivers is empty — possible during policy creation flow
// Better: policy.Drivers.HasElements ? policy.Drivers.first() : null
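The same guard can be sketched in Java, returning an explicit Optional so callers cannot forget the empty case (names are hypothetical, not Guidewire API):

```java
import java.util.List;
import java.util.Optional;

// Illustrative sketch: a safe "first element" helper mirroring the
// HasElements guard above.
public class SafeFirst {
    public static <T> Optional<T> firstOrEmpty(List<T> items) {
        // Guard both a null list and an empty list: both can occur
        // mid-way through a policy creation flow.
        if (items == null || items.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(items.get(0));
    }
}
```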
Medium frequency: Outdated API patterns
AI training data includes older versions of APIs and frameworks. For Guidewire specifically, API patterns changed between major versions. The generated code may use deprecated methods or patterns that don't exist in your target version.
// Guidewire API patterns differ between PC 9.x and PC 10.x
// Always verify generated Guidewire API calls against your version's
// Studio API documentation before assuming the pattern is current
Medium frequency: Missing error handling in integration code
AI generates happy-path integration code well but frequently under-implements error handling — missing retry logic, inadequate timeout handling, or insufficient logging. In insurance integrations where a failed external call can block a quote or claim, this matters.
// AI often generates the success path fully and error handling as:
// } catch (Exception e) { log.error("Failed", e); }
// Missing: retry logic, circuit breaker, graceful degradation,
// structured error response that the calling workflow can act on
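A Java sketch of the missing scaffolding: bounded retries with exponential backoff, ending in a structured failure rather than a log-and-swallow catch. Names and backoff parameters are illustrative assumptions:

```java
import java.util.concurrent.Callable;

// Illustrative sketch of bounded retry with exponential backoff, the kind of
// scaffolding AI-generated integration code tends to omit.
public class RetryingCaller {
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long baseDelayMs) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Exponential backoff: base, 2x base, 4x base, ...
                        Thread.sleep(baseDelayMs << (attempt - 1));
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
        // Surface a structured failure the calling workflow can act on,
        // preserving the underlying cause.
        throw new IllegalStateException("call failed after " + maxAttempts + " attempts", last);
    }
}
```

In production code this would also carry timeout handling and a circuit breaker; the point is that the failure path is a first-class outcome, not an afterthought.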
Medium frequency: Transaction and concurrency blindness
AI rarely considers transaction boundaries, concurrency issues, or the Guidewire bundle model. In Guidewire development, understanding when a bundle commits and what operations are safe within vs. outside a bundle transaction is critical — AI-generated Gosu often ignores this entirely.
// AI may generate code that modifies entities outside a bundle context
// or assumes committed state when working with draft PolicyPeriods
// Always ask: does this code run in a bundle? What's the commit point?
// What happens if the operation fails mid-way?
2

Security considerations AI consistently underweights

AI generates secure-looking code. The problem is that it generates plausible security patterns without consistently applying them, and without understanding the specific threat model of your environment. In insurance IT systems that hold personal health information, financial data, and policy records, security gaps aren't just technical debt — they're regulatory and reputational exposure.

🔒

Input validation gaps

AI validates obvious cases but frequently misses context-specific injection risks. Any input that flows through to a database query, file path, or external API call needs validation that AI may not generate. Review every external input path — especially in Guidewire integration handlers where external systems send data into your policy or claims workflow.
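One concrete defence is strict allowlist validation at the boundary, before the value reaches any query or file path. A Java sketch; the policy-number format here is a made-up example, not a real Guidewire scheme:

```java
import java.util.regex.Pattern;

// Illustrative sketch: validate an externally supplied identifier against a
// strict allowlist pattern before it flows into a query, path, or API call.
public class InboundValidator {
    // Hypothetical format: two uppercase letters followed by seven digits.
    private static final Pattern POLICY_NUMBER = Pattern.compile("^[A-Z]{2}\\d{7}$");

    public static boolean isValidPolicyNumber(String input) {
        return input != null && POLICY_NUMBER.matcher(input).matches();
    }
}
```

Anything that fails the allowlist is rejected outright; validation by allowlist is far safer than trying to blocklist known-bad characters.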

🗝️

Credential and secret handling

AI will sometimes generate code with hardcoded credentials, connection strings, or API keys as examples — and these sometimes make it into codebases. Review every generated file for any string that looks like a credential. All secrets belong in environment configuration, not in code.

📋

Logging sensitive data

AI generates logging that is often too verbose for production insurance systems — potentially including personal information, financial data, or credentials in log output. Review every log statement: what data is included? Who has access to these logs? Does it comply with your client's data handling requirements?
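Where an identifier must appear in logs for correlation, mask it before it reaches the log line. A minimal Java sketch with a hypothetical helper name:

```java
// Illustrative sketch: mask a driver's licence number before logging,
// keeping only the last four characters for correlation.
public class LogRedactor {
    public static String maskLicence(String licence) {
        if (licence == null || licence.length() <= 4) {
            return "****";
        }
        // Mask everything except the last four characters.
        return "*".repeat(licence.length() - 4) + licence.substring(licence.length() - 4);
    }
}
```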

🔐

Authorisation gaps

AI implements authentication patterns better than authorisation. Generated code may authenticate a user correctly but then allow them to access resources without checking whether they're authorised for that specific operation. In Guidewire systems, this intersects with the ACL and permission framework — AI-generated bypass paths can be subtle.

📦

Dependency security

When AI suggests adding a dependency — a library, a package, a utility — it doesn't perform security assessment on that dependency. Any AI-suggested dependency needs a vulnerability scan and an assessment of whether it's approved for use in your client's environment before it enters the codebase.

💾

Data at rest and in transit

AI will implement encryption when you ask for it explicitly. It will frequently omit it when the requirement is implied or when it's not the primary focus of the prompt. Any code that handles personal information, financial data, or credentials should be reviewed specifically for whether data is appropriately protected at rest and in transit.
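When encryption at rest is required, authenticated encryption with a fresh nonce per record is the baseline. A Java sketch under stated assumptions: the class name is hypothetical, and key management is deliberately elided because in practice the key comes from a secrets manager, never from code or checked-in configuration:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;

// Illustrative sketch: authenticated encryption (AES-GCM) for a sensitive field.
public class FieldCrypto {
    public static SecretKey newKey() {
        try {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(256);
            return kg.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static byte[] newIv() {
        byte[] iv = new byte[12]; // 96-bit nonce, the standard size for GCM
        new SecureRandom().nextBytes(iv);
        return iv; // must be unique per encryption under the same key
    }

    public static byte[] encrypt(SecretKey key, byte[] iv, String plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static String decrypt(SecretKey key, byte[] iv, byte[] ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
            return new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

GCM also authenticates the ciphertext, so tampering is detected at decryption time rather than producing silently corrupted data.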

Knowledge Check
AI generates an integration handler that logs the full request and response payloads from a third-party MVR (motor vehicle record) service. The log entries include driver name, date of birth, licence number, and driving history. This is a Guidewire implementation for an Ontario insurer. What is the correct response?
3

The AI code review checklist

A checklist doesn't replace judgment — but it prevents the selective attention problem that makes reviews unreliable. When you're working fast, you unconsciously skip sections you feel less concerned about. A checklist enforces systematic coverage. Use this as a starting point and adapt it to your specific environment and client's requirements.

AI-generated code review checklist
API method verification: Every method call on a Guidewire or third-party API has been verified to exist in the version you're targeting. No invented method names.
Boundary condition tracing: Logic has been manually traced through every boundary case — at-threshold, just-below, just-above. Comparisons use the correct operator (< vs <=) for the business rule.
Null and empty handling: Every optional field access, every collection iteration, every first() or get(0) call has a null/empty check. Tested mentally with null inputs and empty collections.
Error handling completeness: Integration calls have retry logic, timeout handling, and graceful degradation. Exceptions are caught at the right level, logged with context, and surfaced appropriately to the calling code.
No hardcoded values: No URLs, credentials, thresholds, or environment-specific values hardcoded. All configurable values referenced via configuration, not embedded in code.
Log content review: Every log statement reviewed for personal information, credentials, or sensitive data. Log level appropriate for the event. Correlation IDs included for integration tracing.
Transaction context: (Guidewire) Code correctly positioned relative to bundle commit points. Operations that modify entities happen within the appropriate bundle context. Rollback behaviour considered.
Authorisation checks: Any operation that retrieves or modifies sensitive data includes appropriate permission checks, not just authentication. Role-based access verified against security design.
"Can I explain every line?": You can articulate what each significant block does and why, without referring back to AI. If you can't explain a section, you haven't reviewed it — you've skimmed it.
Knowledge Check
You're reviewing AI-generated Gosu code that implements a cancellation refund calculation. You verify the business logic looks correct and the boundary conditions seem right. There's one method — calculateProRataRefund() — that uses an approach you don't fully understand. You're on a deadline. What should you do?
4

Using AI to review AI — a legitimate technique with limits

Using AI to critique its own output is a legitimate and useful technique — with important limits. AI can catch certain categories of its own errors when prompted to look for them specifically. What it cannot reliably do is catch domain-specific logic errors in insurance business rules it doesn't deeply understand, or identify security gaps that require knowledge of your specific environment and threat model.

Prompt — AI review of AI-generated code
Context I'm reviewing the following Gosu code I generated with AI assistance. This runs in Guidewire PolicyCenter 10.x and implements payment plan eligibility logic for Ontario personal auto policies.
Task Review this code and identify: 1) any method or property calls that may not exist in the Guidewire PolicyCenter Gosu API, 2) boundary condition issues — particularly comparisons that might be off-by-one or use the wrong operator, 3) null pointer risks — places where a null value could cause a runtime exception, 4) any hardcoded values that should be configurable, 5) anything that looks like it would behave unexpectedly in a Guidewire bundle transaction context.
Code to review [paste the generated code here]
Format For each issue found: line reference, issue category, description of the concern, and suggested correction. If you are uncertain whether a Guidewire API method exists, flag it explicitly for manual verification rather than assuming it's correct. Do not invent corrections — note where I need to verify against documentation.
What AI self-review catches vs. what it misses

AI self-review reliably catches: obvious null risks, some boundary condition issues, hardcoded values, structural problems. It reliably misses: whether the business logic is actually correct for your specific insurance product and jurisdiction, whether the code complies with your client's specific regulatory environment, Guidewire version-specific API differences, and security gaps that require knowledge of your deployment environment. AI self-review is a useful first pass — it doesn't replace your professional review.

Knowledge Check
You run an AI self-review prompt on your generated code. AI reports: "No issues identified. The code looks correct and follows good practices." You still haven't manually traced the boundary conditions in the premium calculation logic. What do you do?
5

Module summary

Know the failure patterns

Invented API methods, boundary condition errors, null handling gaps, outdated patterns, incomplete error handling, transaction blindness. Systematic review concentrates on these — they're predictable, not random.

Security is your job

AI generates plausible-looking security patterns but doesn't know your environment, your client's data handling requirements, or your regulatory context. Personal data in logs, hardcoded credentials, authorisation gaps — you find these, not AI.

Systematic checklist, not selective attention

A checklist prevents the "I was confident so I skimmed it" failure mode. Apply it to every AI-generated function before submission. Adapt it to your specific environment. The "can I explain every line" test is the final gate.

AI self-review: useful first pass only

Run it — it catches some things fast. But "AI found nothing" does not complete your review obligation. Business rule boundary conditions and domain-specific logic errors require your judgment. AI self-review assists; it doesn't certify.

Ready for Module 03

Module 03 — Debugging and Root Cause — covers the other direction: using AI to diagnose problems in code that already exists. Interpreting Guidewire server logs, tracing integration failures, identifying root cause in complex multi-system environments faster than reading through logs manually.

Module 02 Complete

Reviewing What You Didn't Write is done. Continue to Module 03: Debugging and Root Cause.