We are a commune of inquiring, skeptical, politically centrist, capitalist, anglophile, traditionalist New England Yankee humans, humanoids, and animals with many interests beyond and above politics. Each of us has had a high-school education (or GED), but all had ADD so didn't pay attention very well, especially the dogs. Each one of us does "try my best to be just like I am," and none of us enjoys working for others, including for Maggie, from whom we receive neither a nickel nor a dime. Freedom from nags, cranks, government, do-gooders, control-freaks and idiots is all that we ask for.
Monday, December 5, 2016
Data and Risk
Validation is always welcome. It's great to see someone pick up on your writing and think "I am glad I was able to add to the discussion." I believe this holds when a piece is shared on a site opposing what you've written. I'm not interested in an echo chamber.
Twenty months after writing this post on data, I received notification of its inclusion on another site. Upon reading it, one might be inclined to believe I'm not a fan of data. Not true. I just don't put my full faith in everything as it is presented to me, or simply because it is presented to me.
Since my post, 20 months have passed and nothing has changed. In fact the 2016 election was an example of organizations simply accepting data, becoming reliant on it, while few questioned its value. The data left me, and many others, inclined to believe Hillary would win. At the same time, it left me angry about how it was presented in a "See? We have more information and you don't know what's really going on" manner. The day of the election, however, the long lines I saw (in New York City) left me with the impression the data may not be telling the whole story. If Hillary voters in a safe city were turning out in droves, I came to the conclusion turnout would be high across the board, and high turnout usually coincides with a desire for change. The data itself may not be 'wrong' but whoever was using it was doing so improperly.
In the article linked to my post, there is a brief discussion of large banks using data to avoid lending in poor neighborhoods, thus limiting potential losses on their loan portfolios. This practice, known as 'redlining', is often a source of anger and has yielded policies designed to punish firms which do this. It's my view firms shouldn't be punished for doing this, mainly because these firms are more likely to take outsized risks in good neighborhoods - and face potentially larger losses in those areas. On the other hand, the presumed lost opportunities in poor neighborhoods will eventually find a participant willing to pursue them for profit. It takes time. It takes money. But more importantly, it takes an understanding that risk is the critical part of earning a profit. And it is the word RISK which makes the difference, and shifts the discussion from one of data to something else altogether.
Data, as currently over-used, is designed to reduce risk. If you reduce risk, you also reduce the likely margin of profit. You can increase your odds of being profitable, but the margins on that profitability may be lower than if you'd rolled the dice in a few areas the data told you to avoid because they are poor risks. Large firms, looking to keep risk low and profit as high as possible, enjoy the benefits which data can provide in avoiding potentially bad loans. These benefits carry with them the potential lost opportunity of larger profits from taking on higher risk.
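To put rough numbers on that trade-off, here is a minimal sketch, not anything drawn from a real bank's books. It assumes a one-period loan of one dollar that either repays in full with interest or defaults with nothing recovered. The function name `loan_book_stats` and the two sets of rates and default probabilities are hypothetical, chosen only for illustration.

```python
import math


def loan_book_stats(rate, default_prob):
    """Return (expected return, standard deviation) per dollar lent.

    Assumed model: one period, the loan either repays 1 + rate
    or defaults and pays back nothing.
    """
    payoff = 1 + rate
    expected_return = (1 - default_prob) * payoff - 1
    std_dev = payoff * math.sqrt(default_prob * (1 - default_prob))
    return expected_return, std_dev


# Hypothetical figures: a "safe" book the data approves of,
# and a "risky" book the data says to avoid.
safe = loan_book_stats(rate=0.05, default_prob=0.01)
risky = loan_book_stats(rate=0.15, default_prob=0.08)

print(f"safe:  expected return {safe[0]:.4f}, std dev {safe[1]:.4f}")
print(f"risky: expected return {risky[0]:.4f}, std dev {risky[1]:.4f}")
```

Under these made-up inputs the risky book has the higher expected margin and also the much wider spread of outcomes, which is the point: avoiding the risk also means giving up the extra margin. Different inputs can easily flip the comparison.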
Branch Rickey once said "Luck is the residue of design." Data is a key part of that design. But so is understanding risk and knowing how to manage and profit from it (though not necessarily reduce it).
In order for Hillary to win, she would have had to not only break the glass ceiling, but overcome the “six foot tall” bias whereby all former presidents were six foot or taller, and the “three-term party” bias whereby it is rare for one party to win three terms, and the “Ohio state” bias whereby other than Kennedy in 1960, all presidents have had to win Ohio. Trump was always ahead in Ohio.
This is simple handicapping based on past data. She had an enormous task from the get go.
In 2008, the spin was that Republicans needed to be more open to minority groups, etc. While that is true, the main reason they lost was that their turn was up, and after eight years people wanted a change.
Another important thing to understand is that you don't really know what the data is telling you until you understand why the data is coming out that way. The 2008 mortgage securities collapse was a classic case of seeing the pattern in the data (rising prices) without getting the cause (inflation, not appreciation, in the sector) correct.
I agree with you about redlining. I understand the temptation to force banks to lend to people we think are deserving, but there's no substitute for making the person taking the risk the one to decide how much risk to take. Banking gets all fouled up, though, because the bank is loaning money that belongs to depositors, depositors want a federal guarantee that their money won't all be lost in a loan, and the government has leverage to step in and force banks to use their money (the depositors' money) to accomplish social justice instead of to generate a return that's commensurate with the real risk.
I'm guessing one of the reasons banks used redlining was not to completely reduce their risk - but because they probably felt they couldn't set the loan rates high enough to compensate them for taking that risk on.
If you expect default rates to be 3 times higher, then you need a much higher rate just to break even. That may not have been allowed - or could have also been used to punish the banks for discriminating.
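The break-even arithmetic can be sketched in a few lines. This is a simplified one-period model under stated assumptions, not how any bank actually prices loans: the lender breaks even when the expected repayment equals the principal plus its cost of funds. The function `break_even_rate`, its parameters, and the 2% and 6% default probabilities are all hypothetical.

```python
def break_even_rate(default_prob, recovery=0.0, funding_cost=0.0):
    """Interest rate at which expected repayment just covers the lender's cost.

    Assumed model, per dollar lent over one period:
      (1 - p) * (1 + r) + p * recovery = 1 + funding_cost
    where p is the default probability and recovery is the fraction
    of principal recovered when the borrower defaults.
    """
    if not 0 <= default_prob < 1:
        raise ValueError("default_prob must be in [0, 1)")
    return (1 + funding_cost - default_prob * recovery) / (1 - default_prob) - 1


# Hypothetical default probabilities: 2% in one area, three times that in another.
base = break_even_rate(0.02)
tripled = break_even_rate(0.06)

print(f"break-even rate at 2% defaults: {base:.4f}")
print(f"break-even rate at 6% defaults: {tripled:.4f}")
```

With nothing recovered and no funding cost, tripling the default probability more than triples the break-even rate in this model, and that is before the lender earns any profit at all. If a rate cap sits below that figure, the loan cannot be made without an expected loss.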
Then again, as we found out in 08, banks can be way off on how much risk they are actually taking on without realizing it.
There are a lot of problems with data. First there is 'data' and there is 'DATA'. Sometimes the value or meaning of some data is magnified or at the least blinds us to other data. The media does this all the time. Right now anything Trump does is more important than almost anything else happening around the world. Another problem is simply that for various reasons some data comes in as a landslide while other data is a trickle. This can be bias, poor intelligence, inadequate sources/sensors, etc. Another problem is categorizing and evaluating the data. There is so much data that we use computers to sift through it. But computers aren't human and the humans who program them are prone to mistakes and biases. You will simply never know in a timely manner if your computer system makes mistakes or is biased (e.g. the AGW computer models). Also people understandably get overwhelmed by data and begin to ignore it or see things they want to see. Data isn't knowledge. Computers aren't perfect. Sources are biased. There simply aren't enough grains of salt to deal with the sheer volume of data.
Yet, our very lives depend on data and we better be paying attention.
There's a difference between using data to make a decision and implement policy (global warming, for example) and using data to fulfill obligations and expectations (how much coffee does Starbucks need for the next year).
There are limits to the amount of bias that takes place in the latter, while there is a tremendous amount in the former.
Data is meaningful. I can make a relatively good guess at how much money my sales team is going to generate in the coming year based on information from past performance and feedback from the current marketplace. However, if I decide that information is telling me I need to shift the focus of my sales force from one revenue source to another - well, I'd better be sure the information is causative rather than correlative.
I'd also better be sure I'm not getting caught up in sunk costs - assuming that because I spent money in a certain way, I should continue down that path. That's a common mistake which I see made over and over and over. That's a terrible form of bias which crushes companies.
Ultimately, the biggest problem of data is the pretense of knowledge, which Hayek spoke about. Just having information doesn't mean you have the knowledge to use it properly. Timeliness is critical, so are logistical limitations. Personnel and staffing may play a role in making the decision different than you'd assume.
There is also the consideration of whether or not the data is reliable.
You're right, we better be paying attention. But we better be ready to question everything, rather than just accepting everything.