Editor’s Note — This article is sponsored by Prism Data. As with all sponsored content in Fintech Takes, this article was written, edited, and published by me, Alex Johnson. I hope you enjoy it!

“Lending money isn’t like playing tennis against a wall. There’s always someone on the other side of the net. Often there are multiple someones.”
I can’t remember who told me this, early in my career, but it stuck. And it has been on my mind since the French Open kicked off earlier this month.
Practitioners in lending — loan officers, credit analysts, data scientists — will often become obsessed with their craft. This isn’t a bad thing (you want them to obsess over their craft), but it can lead to problems if those practitioners (and the companies that employ them) lose sight of the context in which they deploy their skills.
It’s fun and rewarding to perfect your forehand by hitting against a wall, but that, by itself, won’t be much use in a match, where another player is standing on the other side of the net, with plans to be much less cooperative than the wall.
Similarly, it’s fun and rewarding to design the perfect credit risk attribute or scoring model. However, that work can quickly be rendered useless if you lose sight of the fundamental reality that when you are lending money, there are always opponents on the other side of the net, studying your game and figuring out ways to beat you.
The trick to winning is ensuring that you have a well-rounded and adaptable style of play.
The recent evolution of banks’ fraud prevention strategies provides an instructive example.
Fraud Prevention
The Customer Identification Program (CIP) rule, implemented as a part of the USA PATRIOT Act in the early 2000s, required banks to develop policies and procedures to “form a reasonable belief” in customers’ true identities (i.e., Know Your Customer or KYC). In practice, this required banks to collect four core data elements — name, address, date of birth, and social security number/tax ID — and to verify that those data elements matched a real identity. To perform this verification, banks leaned on the credit bureaus, which packaged their “credit header data” (the identity data found at the top of every credit file) into fraud/KYC services, available via APIs.
This was a simple solve for the banks (the signals were static and binary, which meant fraud teams only had to deal with perhaps a dozen attributes total) and a lucrative business for the credit bureaus. However, it also turned out to be an easy safeguard for fraudsters to bypass.
Fraudsters figured out that the mere act of applying for credit would, if no matching file was found at the credit bureaus, trigger the creation of a new credit file (using the supplied identity information). With a bit of patience and a few small tradelines, fraudsters could build the histories within those new credit files, which, effectively, allowed them to seed entire synthetic identities into the databases that banks and the government relied on to keep the bad guys out.
By 2016, it was estimated that this “synthetic identity fraud” was costing U.S. lenders $6 billion, accounting for 20% of all credit losses. By 2020, the estimated cost had jumped to $20 billion.
And that’s when banks adapted their game plan.
Instead of relying solely on a deep but narrow source of truth (credit header data), banks layered multiple sources of truth on top of each other to create a more complex (and thus harder to fake or manipulate) approach to satisfying their KYC requirements. These sources of truth included ID document capture and verification, phone-carrier history and reputation, device ID and reputation, behavioral biometrics, and consortium velocity and history.
The result was a much richer and more nuanced portrait of the consumer, based on thousands of attributes and scored using sophisticated gradient-boosting / deep-graph machine learning models. Fraudsters could, in theory, still spoof every layer of this portrait, but the cost became significantly higher: they needed aged phone numbers, domain-reputation histories, high-quality fake ID docs, device farms that never repeat a keystroke pattern, and a cash-flow signature indistinguishable from a legitimate paycheck.
Most gave up or shifted their focus to softer targets.
And banks learned a valuable lesson — it doesn’t matter how smart your risk model is if the dataset your model is built on is narrow and/or easily manipulated.
We’ve learned this lesson in fraud prevention.
We are about to learn it in credit risk underwriting.
Credit Risk Underwriting
Historically, credit files have been rich and incredibly reliable sources of insight into consumers’ creditworthiness: 7+ years of tradelines, inquiries, and public record information, analyzed using thousands of carefully constructed attributes.
However, recently, the core credit files have come under stress, making lenders’ primary job — evaluating and pricing the risk of default — much more challenging.
First, new categories of lenders and loans have emerged, providing competition for traditional products such as credit cards and payday loans. The trouble with these new products (BNPL, earned wage access, etc.) is that the lenders offering them are, for the most part, not furnishing repayment data to the credit bureaus. This is an issue that the bureaus have been trying to address, but until they build sufficient data coverage in these product categories, lenders relying solely on credit bureau data will have a blind spot.
Second, we have credit builder products. I’ve written (ranted?) a lot about credit builder products in the newsletter, so I won’t go overboard here. Suffice it to say that products like secured credit builder cards (offered by neobanks like Chime, Varo, and Current) allow the providers to report revolving credit tradelines to the credit bureaus while taking essentially zero risk themselves. Is this fraudulent? No. Does it warp the credit risk signals that lenders have traditionally relied on? It absolutely does.
And oh, by the way, generative AI has the potential to make this credit builder problem a thousand times worse. Imagine an AI agent that a consumer can give natural language instructions to (“I need a 5% auto loan in two months”) that then reverse-engineers a path to get there and autonomously takes actions on behalf of the consumer (including making payments and opening and closing accounts) to get them the score increase they need. This isn’t science fiction. There are fintech companies building that type of agentic credit-building service right now.
Third, and finally, we have credit data disruptions that are the intentional or unintentional result of public policy decisions. During the pandemic, federal student-loan payments were paused for more than three years. Roughly 38 million borrowers stopped accruing negative amortization; their credit files showed current status even though no actual cash was leaving their checking accounts. A recent Boston Fed study found a measurable lift in average credit scores for that cohort, a lift that vanished once payments resumed in late 2023 and early 2024.
Bottom line — if lenders keep relying on credit bureau data alone, they risk repeating the fraud story: a single, venerable data source gradually bends under manipulation until its predictive power fractures.
Fraud teams solved their problem by adding signals that are hard to counterfeit in bulk: device history, behavioral biometrics, and consortium velocity. Credit teams need an analogous upgrade, and the obvious candidate is consumer-permissioned bank-transaction data, also known as cash flow data.
Cash Flow Data
The important thing to understand about cash flow data is that it covers a much broader range of financial activity than traditional credit bureau data.
The analysis of credit bureau data has, over the last 50 years, led to the development of thousands of attributes.
In just the last 10 years, cash flow data has produced tens of thousands of attributes. These attributes allow lenders to measure an enormous range of financial situations and behaviors, including:
- Income — sources, volatility, trends over time, etc.
- Expenses — discretionary vs. non-discretionary, categories of spend, etc.
- Debt Obligations — including those that don’t show up in core credit files, such as BNPL, EWA, and payday loans.
- Stability — income, expenses, balances, trends over time, etc.
- Risk — excessive gambling, frequent account shortfalls, signs of financial abuse, etc.
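To make the idea of a “cash flow attribute” a bit more concrete, here is a minimal Python sketch of how a handful of them might be derived from raw transactions. Everything in it is an assumption for illustration: the `Txn` record shape, the pre-labeled `category` field, and the four attribute names are hypothetical, and real attribute libraries are far larger and more careful than this.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev
from typing import NamedTuple


class Txn(NamedTuple):
    """Hypothetical transaction record; real aggregator feeds will look different."""
    posted: date
    amount: float    # positive = deposit, negative = debit
    category: str    # assumed to be pre-labeled, e.g. "income", "rent"
    balance: float   # account balance after the transaction posts


def cash_flow_attributes(txns: list[Txn]) -> dict[str, float]:
    """Derive a few illustrative monthly attributes from raw transactions."""
    income = defaultdict(float)   # (year, month) -> total income deposits
    spend = defaultdict(float)    # (year, month) -> total debits
    negative_days = set()         # distinct dates with a negative balance

    for t in txns:
        month = (t.posted.year, t.posted.month)
        if t.category == "income":
            income[month] += t.amount
        elif t.amount < 0:
            spend[month] += -t.amount
        # Deposits that are not labeled "income" are ignored in this sketch.
        if t.balance < 0:
            negative_days.add(t.posted)

    months = sorted(set(income) | set(spend))
    if not months:
        return {}

    incomes = [income[m] for m in months]
    surpluses = [income[m] - spend[m] for m in months]
    avg_income = mean(incomes)
    return {
        "avg_monthly_income": avg_income,
        # Coefficient of variation: 0.0 means the same income every month.
        "income_volatility": pstdev(incomes) / avg_income if avg_income else 0.0,
        "avg_monthly_surplus": mean(surpluses),
        "negative_balance_days": float(len(negative_days)),
    }


# Made-up two-month history for a single account.
sample = [
    Txn(date(2024, 1, 5), 3000.0, "income", 3200.0),
    Txn(date(2024, 1, 6), -1500.0, "rent", 1700.0),
    Txn(date(2024, 1, 20), -1800.0, "other", -100.0),
    Txn(date(2024, 2, 5), 3000.0, "income", 2900.0),
    Txn(date(2024, 2, 6), -1500.0, "rent", 1400.0),
]
attrs = cash_flow_attributes(sample)
print(attrs)
```

Even this toy version touches several of the categories above (income, expenses, stability, and risk) from a single stream of transactions, which is the point: the raw material supports many more angles than a tradeline does.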
The reason this breadth of insight is important is twofold.
First, it adds a depth of understanding beyond what even the best traditional credit bureau data attributes and scoring models can discern. A borrower who ends each month with a significant deposit surplus and whose largest debit is rent behaves very differently from one who burns through paychecks in four days and relies on overdraft grace periods to buy groceries. Yet these two consumers can share the same 680 FICO if they manage revolving balances with equal discipline. Cash flow data allows lenders to parse these important differences.
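Here is a rough Python sketch of that comparison. The two applicants, their balances, the `days_until_depleted` helper, and the $100 floor are all made up for illustration; this is not how any particular scoring model works, just a way to see how two identical bureau scores can sit on top of very different cash flow patterns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Applicant:
    name: str
    bureau_score: int             # FICO-style score (hypothetical values)
    daily_balances: list[float]   # end-of-day balances, index 0 = payday


def days_until_depleted(balances: list[float], floor: float = 100.0) -> Optional[int]:
    """Days after payday until the balance first drops below `floor`.

    Returns None if the balance never drops below the floor.
    The $100 floor is an arbitrary assumption for this sketch.
    """
    for day, balance in enumerate(balances):
        if balance < floor:
            return day
    return None


# Same bureau score, very different pay cycles (made-up numbers).
saver = Applicant("A", 680, [3200, 1700, 1650, 1600, 1500, 1400, 1300, 1200])
spender = Applicant("B", 680, [3000, 1400, 600, 150, 40, -20, -60, -90])

for applicant in (saver, spender):
    depleted = days_until_depleted(applicant.daily_balances)
    print(applicant.name, applicant.bureau_score, depleted)
```

Applicant A never falls below the floor, while Applicant B gets there four days after payday and then goes negative. The bureau score alone can't tell them apart; one simple balance-based attribute can.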
Second, cash flow data is difficult for fraudsters to fake, and difficult for desperate consumers or motivated policymakers to manipulate. Cash flow attributes and scores are derived from ground-level banking activity: the money that moves in and out of KYC’d bank accounts. Faking or manipulating that activity takes a tremendous amount of time and effort (opening and funding accounts, making large recurring transactions, etc.), which destroys the ROI for the party making the attempt.
Time for a Layered Approach
If history is any guide, layered credit intelligence will go from optional to expected faster than most lenders plan for.
Fraud moved from header-file checks to orchestrated, multi-signal platforms in about six years. The same market forces — loss pressure, competitive differentiation, regulatory encouragement — are lining up behind cash-flow-augmented credit scoring right now.
This doesn’t mean credit bureau data will lose relevance. On the contrary, bureau tradelines remain the best long-run record of repayment morality, just as the header file remains the place you ultimately look to confirm a consumer’s name and SSN. But in risk, every data source is only a partial view.
The game is no longer won by perfecting a single swing; it is won by reading the whole court, anticipating where the next shot will land, and meeting the ball with the right angle of attack.