mindstalk: (Default)
I'm reading How To Measure Anything and it's had some surprising revelations.

Rule of Five: if you take 5 random samples from a population, there's a 93.75% chance the population median is within the minimum and maximum (range) of the sample. For it not to be, all 5 samples would have to be e.g. below the median. They're random, so the chance of that is 0.5**5 = 3.125%. They could also be above the median, so that's another 3.125%.

The same math gives that a sample set of 3 has a 75% chance of bracketing the population median! A set of 7, 98.4%.
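
The claim is easy to check numerically; here's a minimal Monte Carlo sketch (the exponential population is an arbitrary choice, just something skewed):

import random

population = sorted(random.expovariate(1.0) for _ in range(100001))
median = population[len(population) // 2]

trials, hits = 20000, 0
for _ in range(trials):
    sample = random.sample(population, 5)
    if min(sample) <= median <= max(sample):
        hits += 1
print(hits / trials)   # should hover around 0.9375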


Single Sample Majority: This one's a bit trickier. Say you have a bunch of urns containing red and green marbles, with each urn's red fraction drawn uniformly from 0% to 100%. If you draw a single marble from each urn and bet that the majority color of each urn matches the corresponding draw, you'll be right 75% of the time. Bayes:

p(urn|draw) = p(draw|urn) * p(urn)/p(draw)

Uniform, so p(urn) = p(draw) = 0.5 (read 'urn' as 'the urn is majority red' and 'draw' as 'the draw came up red'), and those cancel.

Uniform, so p(draw|urn) averages 0.75. It varies urn by urn: a 51% red urn gives a 51% chance of a red draw, a 95% red urn a 95% chance. But the majority-red urns range uniformly from just over 50% red to 100% red, so on average the chance is 75%.

Honestly this one seems less generally useful than the Rule of Five, but it's still impressive -- if you don't know much, even a single sample can be meaningful.
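
And the 75% figure is just as easy to check by simulation (urn fractions drawn uniformly, per the setup above):

import random

trials, wins = 100000, 0
for _ in range(trials):
    p_red = random.random()              # urn's red fraction, uniform on [0,1]
    draw_red = random.random() < p_red   # draw one marble
    majority_red = p_red > 0.5
    if draw_red == majority_red:
        wins += 1
print(wins / trials)   # ~0.75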
mindstalk: (thoughtful)
I mentioned a couple years ago that I'd found proofs of the laws of sines and cosines; I wasn't clear then that proofs of those, like the proof of sin(A+B), weren't part of my trig education, though lots of other identities were derived from the addition and Pythagorean formulas.

The law of sines proof is easy and stayed with me. I don't recall my old cosine proof (it was all in my head, no notes), and the one I came up with recently feels simpler (I recall grousing about grinding through algebra) and different. Also like it would horrify the people who struggle with the concept of variables in algebra.



My drawing program sucked; Ca and Cb are supposed to refer to the two parts of angle C, split by the altitude h dropped from C to side c, and ca and cb are the corresponding two segments of side c.

Law of Sines:

sin(A) = h/b
sin(B) = h/a
a*sin(B) = h = b*sin(A)
sin(A)/a = sin(B)/b
Voila.

Law of Cosines:

This has less obvious motivation, IMO; I like proofs that have a direction to them, rather than just happening to end up at your goal. But you take what you can get.

c^2 = (ca+cb)^2 = ca^2 + cb^2 + 2*ca*cb

ca^2 = b^2 - h^2, likewise cb^2 = a^2 - h^2

But also ca = b*sin(Ca) and cb = a*sin(Cb)

This is where the variable-strugglers would hate it, because we use *both* substitutions at once.

c^2 = b^2 - h^2 + a^2 - h^2 + 2*a*b*sin(Ca)*sin(Cb)

c^2 = a^2 + b^2 - 2*h^2 + 2*a*b*sin(Ca)*sin(Cb)

And let's do it again! h = b*cos(Ca) = a*cos(Cb)
h^2 = h*h and we'll use different substitutions for each h:

c^2 = a^2 + b^2 - 2*b*cos(Ca)*a*cos(Cb) + 2*a*b*sin(Ca)*sin(Cb)

Using cosine addition,

c^2 = a^2 + b^2 - 2*a*b*cos(Ca+Cb)

c^2 = a^2 + b^2 - 2*a*b*cos(C)

Voila!
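
A quick numerical sanity check of the end result (the side lengths and angle are arbitrary): put C at the origin, lay side b along the x axis, and compare the coordinate distance between A and B to the formula.

import math

a, b, C = 3.0, 5.0, math.radians(70)       # arbitrary triangle
A = (b, 0.0)                               # vertex A, out along side b
B = (a * math.cos(C), a * math.sin(C))     # vertex B
c_coords = math.dist(A, B)
c_law = math.sqrt(a*a + b*b - 2*a*b*math.cos(C))
print(c_coords, c_law)                     # should agree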

Technically these proofs should be redone for obtuse angle triangles, to make sure they work when the perpendicular h is outside the triangle, but I haven't felt like being that rigorous.
mindstalk: (science)
I'm reading Judea Pearl's _The Book of Why_ (Caltech book club), about causal inference and types of causation. Just got through necessary vs. sufficient causation. It seems to me that captures part of the gun debate: widespread gun ownership is necessary for frequent mass shootings (no guns, no shootings) but not sufficient (we can imagine lots of guns without shootings... like most of the US's own history).

Gun control advocates point out that taking the guns away will stop the shootings; gun rights advocates argue guns aren't really to blame.

Pearl's own illustrative example is a house fire: a match (or some other ignition) is both necessary and sufficient, oxygen in the air is necessary (no oxygen no fire) but not sufficient (just adding oxygen doesn't start a fire). That includes an assumption that oxygen is 'normal', present whether or not the match is (thus enabling the match to be sufficient, because we can assume the oxygen is there).

Are guns like oxygen? For a lot of the US, yes, in being ubiquitous, considered normal, and mostly not killing people. Removing oxygen to prevent fires is usually overkill, and they would say removing guns is too.

Other modern crises:

* greenhouse gases are reasonably necessary and sufficient for global warming. (No gases, no warming; adding greenhouse gases to an 1800 AD background suffices to cause warming.) (As a side note, Pearl points out that climate models allow climate scientists to generate counterfactual 'data' quite easily, by tweaking model parameters.)

* cars and deaths: cars are necessary for 40,000 dead Americans a year (no cars, no such deaths... alternative transport modes aren't nearly so dangerous) and sufficient (adding cars, or rather a car-oriented and -dependent culture[1] is the main reason we have the deaths).

[1] Japan actually has 3/4 as many cars per capita as the US, but 1/3 as many road fatalities per capita; the cars are used less, as well as being smaller and slower in common use. Many fewer deaths per vehicle too, but similar deaths per vehicle-km.
mindstalk: (Default)
I forgot about the laws of sine and cosine. They're easy to prove once you know them, though I don't know about discovering them from scratch.

I was reminded after the previous post that you can get another triangle with just a fair bit of algebra. sin 2*18 = cos 3*18, solve for sin 18; you'll get a quadratic, and a fairly simple solution (-1+sqrt(5))/4. (If you try sin 18 = cos 4*18, you get a quartic, and I don't know how to solve it analytically.)
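
Spelling that out: sin(2x) = 2*sin(x)*cos(x) and cos(3x) = 4*cos(x)^3 - 3*cos(x); set them equal, divide by cos(x), and use cos^2 = 1 - sin^2 to get 4*sin(x)^2 + 2*sin(x) - 1 = 0, whose positive root is (-1+sqrt(5))/4. A two-line check:

import math
print((-1 + math.sqrt(5)) / 4, math.sin(math.radians(18)))   # both ~0.3090169...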

I've been writing code to approximate log or powers via iterated square roots. It works, though takes a fair number of steps.
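
Here's one version of that idea for natural log, as a sketch of the general technique rather than the code just mentioned: keep taking square roots until the argument is near 1, use ln(1+e) ~= e, and scale back up by the matching power of 2.

import math

def ln_by_roots(x, n=30):
    # x^(1/2^n) = 1 + eps with eps ~= ln(x)/2^n, so ln(x) ~= 2^n * eps
    y = x
    for _ in range(n):
        y = math.sqrt(y)
    return (y - 1) * 2**n   # accuracy eventually limited by floating point

print(ln_by_roots(10.0), math.log(10.0))   # ~2.3025..., 2.302585...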
mindstalk: (thoughtful)
One virtue of my math education was that it was fairly proof heavy; I feel I can demonstrate or prove most of what I know. I may have mentioned this not extending to sin(A+B), which I had to figure out later as an adult. It also didn't include detailed "how do we calculate this shit?" for stuff like trig, log, fractional exponents. Once you hit calculus the answer becomes "Taylor series or something better", but before then?

Feynman talks about it somewhat in Chapter 22 of his Lectures, which is a cool read; he goes from defining multiplication in terms of addition, to numerically suggesting Euler's formula, and showing how to build up log tables along the way. It's a heady ride. But he doesn't talk about trig tables.

I've long had my own thoughts about those, and finally wrote some code to try out the calculations.

Basic idea is that given sin(a) = y/r, cos(a) = x/r, and the Pythagorean theorem, there are two right triangles for which we know the values precisely. The 45-45 is trivial, while the 30-60 one can be gotten from sin(60) = cos(30) = sin(30+30) = 2*sin(30)*cos(30). From there, you can apply the half-angle formula as often as you please, to get very small angles, at which point you'll notice that sin(x) ~= (approximately equal to) x for small x. Thus you can approximate an angle like 1 degree which you can't otherwise[1] get to, then use double and addition formulas to build back up to say 10 degrees or anything else.
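
A minimal sketch of that build-up (not the code referenced here; the 0.125-degree seed and the 15-degree target are just the example values discussed below):

import math

def double(s, c):         # sin(2x) = 2*s*c, cos(2x) = c^2 - s^2
    return 2*s*c, c*c - s*s

def add(s1, c1, s2, c2):  # angle addition formulas
    return s1*c2 + c1*s2, c1*c2 - s1*s2

s = math.radians(0.125)   # small-angle approximation: sin(x) ~= x
c = math.sqrt(1 - s*s)
for _ in range(3):        # 0.125 -> 0.25 -> 0.5 -> 1 degree
    s, c = double(s, c)

s_acc, c_acc = s, c       # running total: 1 degree so far
s_pow, c_pow = s, c       # current power-of-two angle
for _ in range(3):        # add 2, 4, 8 degrees: 1+2+4+8 = 15
    s_pow, c_pow = double(s_pow, c_pow)
    s_acc, c_acc = add(s_acc, c_acc, s_pow, c_pow)

print(s_acc, math.sin(math.radians(15)))   # agree to roughly 6 places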

The question in my mind always was, how good is that? And the code finally answers the question. Sticking to half-angle stuff is *very* accurate, matching the built-in function to 15 decimal places. Using a sin(1 degree) approximation and building up to sin(15 degrees) is accurate to 4 places; starting from 0.125 degrees instead is accurate to 6 places. I don't know how practically good that is; one part in a million sounds pretty good for pre-modern needs -- like, at that point can you make or measure anything that precisely? -- but Feynman says that Briggs in 1620 calculated log tables to 16 decimal places.

[ETA: hmm, I'd been assuming the built-in function, presumably based on power series, was the most accurate version, but maybe I should view it as deviating from the exact roots of the half-angle approach. Or both as having small errors from the 'true' value. Especially as Python's sin(30) returns 0.49999999999999994, not 0.5. Of course, both approaches are using floating point. sin(60) = sqrt(3)/2 to the last digit, but cos(30) is slightly different.]

[1] The two exact triangles basically give you pi/2 and pi/3, and those divided by 2 as much as you want. But you can't get to pi/5 exactly.
a/2 + b/3 = 1/5 #a and b integers
15a + 10b = 6
5(3a+2b) = 6
(note that 2a+3b=5 is totally solvable.)

Though I guess a more valid approach would be
c(1/2)^a + d/3*(1/2)^b = 1/5 #all variables integers
c*5*3*2^b + d*5*2^a = 3*2^(a+b)
5(c*3*2^b + d*2^a) = 3*2^(a+b)
which still isn't soluble.

Likewise for getting to pi/9 (20 degrees):
c(1/2)^a + d/3*(1/2)^b = 1/9
c*9*2^b + d*3*2^a = 2^(a+b)
3(c*3*2^b + d*2^a) = 2^(a+b), which again isn't soluble, since 3 can't divide a power of 2.
mindstalk: (Default)
Say you're a regular commuter, taking transit at least twice a workday. That's 10 trips a week, which would cost $22.50 at the $2.25 CharlieCard fare. A 7 day pass is $21.25, so it totally makes sense to buy one, then ride the T whenever you want. Even if you somehow had a 4 day workweek, a couple more trips would be likely.

Four 7 day passes would be $85; a monthly pass is $84.50. So that makes sense too. Or does it? Say you have three weeks of vacation, and leave town for them; maybe you'd save money by just cycling 7 day passes, and skipping the weeks you're gone.

I approached the math from a couple different angles, but this presentation seems best: a month pass costs about the same as 4 weeks, so 12 monthly passes covers the year for the cost of 48 weekly passes. Even if you skip 3 weeks, you'd still have to buy 49 passes... plus covering that extra day (or two, if leap) in the year. So go monthly!

Though, having been using 7 day passes, I noticed that they actually shuffle forward. If I buy one on Monday morning, the next Monday I can leave a bit earlier and still use it, buying (or activating) my next pass Monday evening. And so on. The effect is that you end up covering 30 days for the cost of 4 passes, as each one picks up an extra "half-day" commute. And if you shuffled into buying a pass on a weekend, well, maybe you could skip travel that day and save an extra day.

Of course, there's a week's worth of 31 day months, so there's that -- you're not quite getting a month's worth for 4 passes.

It's nice doing estimations in my head, but at some point you have to turn to a calculator for precision. A year's worth of monthly passes is $1014. If you cover 30 days with 4 weekly passes, that's $85 per 'month', and $1020 to cover 360 days, with 5 more days to finagle. OTOH, if you can skip 3 weeks, you'd spend just $956.25 in a year, saving $57.75. Or $42.57, if you threw in 5/7 of another pass for the extra days.
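
For what it's worth, here's one way to reproduce the figures above (fares as in this post; the 45 weekly passes is my reading: 48 shuffled passes cover roughly 360 days, minus 3 for the skipped weeks):

MONTHLY, WEEKLY = 84.50, 21.25
yearly_monthly = 12 * MONTHLY            # 1014.00
yearly_weekly = (48 - 3) * WEEKLY        # 956.25, shuffling and skipping 3 weeks
print(yearly_monthly, yearly_weekly, yearly_monthly - yearly_weekly)    # saving 57.75
print(yearly_monthly - (yearly_weekly + 5/7 * WEEKLY))                  # ~42.57 with the extra days covered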

Of course, that assumes you can maintain the shuffle. Weekends offer skipping a day, but a regular weekend thing might pin you down. Say I activate a pass at 8pm Sunday to go to Grendel's; the next week I might leave earlier, but I'd still have to activate a new one at 11:30 to get home. The week after that I could leave Grendel's a bit earlier, activating the next pass on Monday morning... okay, it still works, though Sunday feels a bit sticky due to the short 'commute'.

Of course, the monthly pass means not having to buy stuff every week, nor worry once a week about the timing of when you do things. OTOH, saving $40 to 60... well, it's not a ton, but it's not trivial either; 40/1014 is 4%.

Extra thought: if you really use the weekends on your one-week vacations, you could save another 2 days each, or 6 days total, in effect skipping another week.

As for me, if I had today off I'd probably just go monthly. Annoyingly, I probably have 4 or 5 trips to make today. Cash today and monthly tomorrow, or weekly today?

***

Meanwhile, the $12 daily pass is hard to justify unless you run around a lot. Even for a tourist spending $2.75 per trip via CharlieTicket, it costs more than 4 trips -- though if you're doing train/bus transfers that becomes a lot easier to justify, since the Tickets don't give a free transfer. But even then you'd need bus/train, bus/train, and one more trip. For a Card user you'd need to make 6 independent trips to make a day pass economical. Most likely use case would be having to make multiple quick trips along a train line.
mindstalk: (Default)
When I was a kid, I learned -- probably in The Gentle Art of Mathematics or some such book -- how to convert numerals from one base to another. Divide n by newbase, record the remainder, replace n by the quotient, repeat until quotient = 0, then write the remainders in reverse order.

This manifestly worked, if you did it, but always felt backwards to me, doubly so -- once for the reverse order thing, and once for the fact that you're only using remainders, not quotients, or so it seemed. Plus, though it gave the right answer, I didn't have much sense of why it worked.

Well, I was thinking about it early this morning in lieu of sleeping again, as I do, and came up with a method[1] that's slow but does both use quotients and yield the digits in MSB (most significant 'bit' first, descending significance). And then I thought about the old remainder method again, and it made more sense[2], and I also realized a key mismatch: the whole cycle of dividing by newbase calls up long division, may even involve long division if newbase is multiple digits in oldbase, and long division gives its answer in MSB. The remainder method chews up the number LSB, and gives the newbase numeral's digits LSB, so it's not backwards at all by its own lights, but it is LSB instead of MSB. Works fine if you write the units first and work to the left.

[1] Trial and error: divide by increasing powers of newbase until the quotient is zero; back off to the preceding power, and the resulting quotient is the first digit (MSB) of your answer. Repeat for the remainder. So (base 10 to base 10, for simplicity), n=742: find that 742//1000 = 0, 742//100 = 7, so 7, then repeat for 42. A subtlety is that numeral-based long division may not work, e.g. if you're converting a binary numeral to base ten and dividing by '1010': binary quotients are only 0 or 1. You'd have to go back to basics, division as "how many times can I subtract b from a?"

[2] 10 to 10 again: n=742, divide by 10 and get 2, that's your units; now shift right (or use the quotient) to get 74, divide by 10, get 4, which is your tens digit, and so on. Thinking about the relation between dividing by 2, and shifting the binary representation of a number to the right, made it all clear.
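
For concreteness, a sketch of both methods (the function names are mine): the remainder method produces digits least-significant first, while the trial-and-error quotient method of footnote [1] produces them most-significant first.

def to_base_lsb_first(n, base):
    # remainder method: digits come out least-significant first, so reverse at the end
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(r)
    return digits[::-1] if digits else [0]

def to_base_msb_first(n, base):
    # quotient method: find the largest power of base <= n, peel digits off from the top
    digits, power = [], 1
    while power * base <= n:
        power *= base
    while power:
        digits.append(n // power)
        n %= power
        power //= base
    return digits

print(to_base_lsb_first(742, 10), to_base_msb_first(742, 10))   # [7, 4, 2] both ways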

Dunno if I'm making it clear. I think the core insight for me is simply "base conversion felt backwards because you felt it should go in the same order as long division, but it doesn't and is doing something else." Also "put the remainders aside and write in reverse order" is less clear than "write the remainders as you find them, but right to left."
mindstalk: (Default)
A simple problem I'd never had occasion to think much about before I saw a sample coding problem.

How to rotate the elements of an N-element array by k spaces? An obvious way is to shuffle it one space, k times, but that's slow, O(N*k). Faster, which I saw about as soon as I thought about performance, is to use a second array B, where B[(i+k)%N] = A[i]. But that takes O(N) extra space. Can you do better?

Yes, as I realized at 5am. For each element A[i], move it to A[(i+k)%N]. O(N) time, O(1) extra space. Can't beat that!

Except some mildly tricky math intervenes: the natural approach only works if N and k are relatively prime. A more general solution is

let g = gcd(N,k)
for i in [0,g)
  for j in [0, N/g)
    shuffle element k spaces


Despite looking like a double loop, it's still O(N), with g*N/g iterations.

[2020 Edit: Looking back... I find this less than clear. I went to the code to make sure I got it right.

from math import gcd

def rotate(ar, k):
    # rotate ar forward by k places: O(N) time, O(1) extra space (cycle method)
    N = len(ar)
    g = gcd(N, k)
    for i in range(g):                    # one pass per cycle
        cur = ar[i]
        for j in range(1, N//g + 1):
            temp = ar[(i + j*k) % N]
            ar[(i + j*k) % N] = cur       # each element moves k places forward
            cur = temp

a = list(range(6)); rotate(a, 2)          # a is now [4, 5, 0, 1, 2, 3]

]

I've also learned that C++ has a gcd function now. std::experimental::gcd, in the header experimental/numeric. C++17 moves it out of experimental but even g++ isn't doing that by default.

The really annoying thing is that this is the sort of solution that comes naturally to me lying in bed, with little conscious effort, but that I'd likely fail to get in a whiteboard session or timed coding test, due to stress dropping my working IQ.
mindstalk: (Default)
So there are the left and right Riemann sums, and the much better midpoint Riemann sum. Recently I wondered about integrals that took the average of the endpoints of each strip: (f(x)+f(x+step))/2. The thing is that most of the halves combine, so you can just add up f(x) for each whole step, plus f(start)/2 and f(end)/2. How's that compare?

Much better than the left and right sums, but not quite as good as the standard midpoint one. E.g. the integrals of sin(x) over 0 to pi/2 are

left_: 0.9992143962198378
right: 1.0007851925466327
mid__: 1.0000001028083885
mine_: 0.9999997943832352
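
A sketch that reproduces numbers in this ballpark (1000 strips is my assumption; it seems consistent with the figures above):

import math

def riemann(f, a, b, n):
    h = (b - a) / n
    xs = [a + i*h for i in range(n)]         # left edges of the strips
    left = h * sum(f(x) for x in xs)
    right = h * sum(f(x + h) for x in xs)
    mid = h * sum(f(x + h/2) for x in xs)
    trap = left + h * (f(b) - f(a)) / 2      # "mine": average the two endpoints of each strip
    return left, right, mid, trap

for value in riemann(math.sin, 0, math.pi/2, 1000):
    print(value)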

All the other integrals I tried show a similar pattern: x, x^2, x^3, 1/x, e^x... the two are close, but midpoint is just a bit closer to the correct answer. Or looked at another way, has close to 1/2 the error... hmm, that factor is consistent too. I should look into that.

Or: if I've got my terminology right, my "average the endpoints" sum is just the trapezoidal rule, and the midpoint rule has about half the error of the trapezoidal rule. Which is not what I would have expected.
mindstalk: (Default)
I've never forgotten how to make a Taylor series, but I did forget why or where it came from. I thought about it a bit in terms of increasingly accurate approximations with derivatives, then gave up and looked it up.

"Assume a function has a power series in (x-x0), then plug in x=x0 and take derivatives to find the coefficients."

Hmmph. It certainly works, but feels kind of out of thin air.

As for what they're for, my first thought was "deriving wacky identities for pi and e". More seriously, calculating transcendental functions. But I'd not learned one twist on the latter: if you have a nasty integral, you can expand the integrand as a Taylor series, integrate *that*, and voila, you can calculate the integral.

***

As hinted at before, I like revisiting fundamentals. Feynman's Lectures, volume I chapter 22, is like the bible of this, where he starts from natural numbers, goes through making log tables, then evaluating complex exponentials, and finally getting to Euler's formula through sheer calculation. But there are basics he doesn't cover, like evaluating trig functions without Taylor series. The obvious thing is to work from known angles with the half-angle and angle-addition formulas. But where do the raw values come from? How do we know what angles go with what side length ratios?

Well, the isosceles right triangle is entirely determined by its name, that one's trivial. But how do we know that 30-60-90 goes with 1-sqrt(3)-2? I came up with one way: cos(30)=sin(60)=2*sin(30)*cos(30) -> sin(30) = 1/2.

I learned just the other day that sin(18) has a fairly simple algebraic value. But if you try a similar approach to the above, cos(18)=sin(72), you end up with a cubic equation in sin(18). Bleah! Instead there's a geometrical approach, which I can't describe well without diagrams, but it starts with assuming there's an isosceles triangle such that bisecting one of the equal angles yields a similar sub-triangle. That quickly gives you a 36-72-72 triangle; more cleverness yields that the side/base is phi (golden ratio), and so sin(18) = 1/(2phi).

Speaking of which, the proof that phi is irrational is simple and neat. The golden ratio is defined by n/m = m/(n-m), n>m. Assume it's a rational, so that m and n are integers, and n/m has been reduced to lowest terms. But the golden ratio definition requires there be a yet smaller m/(n-m), contradicting the lowest terms assumption... There's a related geometrical proof, where you start with a golden rectangle with integer sides, cut out a square with integer sides, and keep going... you run out of integers.
mindstalk: Tohsaka Rin (Rin)
This was something covered in pre-calculus, way back in 7th grade. Like most of precalc, it didn't leave a deep conscious impression on me. Or much of any impression, in this case. In my new hobby of revisiting math fundamentals, largely in bed, I tried to figure it out in my head, got annoyed, and looked it up. The formula isn't bad, though not one I was working toward (it uses Ax+By+C=0 lines, I was using y=mx+b), but the proofs I saw were pretty grotty. I came up with a new one I like more.

Simplify! Skipping the *really* simple cases of horizontal and vertical lines, what's the distance from a line to the origin? Combining the two representations, the line is y = -Ax/B - C/B. A perpendicular line through the origin is y = Bx/A. Via algebra, the point of intersection is x = -CA/(A^2+B^2), y = -CB/(A^2+B^2). The distance from that to the origin is |C|/sqrt(A^2+B^2). (Adding |absolute value| because distances must be positive; the C is really a square root of C^2 from the Euclidean distance formula, so of course pick the positive root.)

Now, given a line and some other point (X,Y), we just have to translate the system so the point coincides with the origin. This turns the line into A(x+X) + B(y+Y) + C = Ax + AX + By + BY + C = Ax + By + (AX+BY+C) = 0. Applying the previous result, the distance is |AX+BY+C|/sqrt(A^2+B^2). Voila!
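
And a quick numeric check of the final formula against a case you can do by hand (the particular line and point are arbitrary):

import math

def dist_point_line(A, B, C, X, Y):
    # distance from (X, Y) to the line Ax + By + C = 0
    return abs(A*X + B*Y + C) / math.hypot(A, B)

# the line x + y - 1 = 0 is sqrt(2)/2 away from the origin
print(dist_point_line(1, 1, -1, 0, 0), math.sqrt(2)/2)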

I need a math icon.
mindstalk: (Default)
The classical/medieval trivium consisted of grammar, logic, and rhetoric. The quadrivium consisted of arithmetic, geometry, music, and astronomy. I always joked that when you're stuck with Roman numerals, doing arithmetical calculations is an advanced subject. In my recent re-dipping into number theory, I have learned that arithmetic was synonymous with number theory, or vice versa, and many number theoretic proofs are part of Euclid's Elements. So the subject of the quadrivium may have been more advanced than I thought, if not more useful. (Pity the manor whose lord made plans based on the properties of perfect numbers.)

(How does one do calculations with Roman numerals? "Use an abacus", I assume.)
mindstalk: (Nanoha)
In number theory there is Euler's φ(n) (or totient function), henceforth phi(n) because easier to type, which gives the number of numbers (sorry) less than n which are relatively prime to n. So for n=6, phi(n)=2, because only 1 and 5 are relatively prime to 6, {2 3 4} all sharing a divisor. Confusingly, 1 is both a divisor of 6 and considered relatively prime to it, with gcd(1,6) = 1.

The formula for phi is elegant: if n=p^a*q^b*r^c... p, q, r being distinct prime factors, then phi(n) = n * (1 - 1/p) * (1 - 1/q) * (1 - 1/r)...
So for n=6, 6*(1/2)*(2/3)=2
n=12, 12*(1/2)*(2/3)=4, {1 5 7 11}

Davenport (The Higher Arithmetic) gives a proof based on the Chinese remainder theorem, which I think I understood but have forgotten. Andrews (Number Theory) gives a different proof based on Moebius numbers, which has so far resisted my casual and sleep-deprived browsing.

But looked at another way, it seems obvious! That way is the Sieve of Eratosthenes, one of the few episodes I distinctly remember from elementary school math. The formula represents striking out all the multiples of p (including p), then striking from what is left all the multiples of q, and so on. All remaining numbers (including 1, heh) will share no prime factors with n.

So for n=6:
{1 2 3 4 5 6}
{1 3 5} 1/2 numbers remaining
{1 5} 2/3 of those remaining

That doesn't feel like a traditionally rigorous proof, but it also seems pretty straightforward and without obvious holes.

(If you multiply out the terms, you get something you can see as subtracting from the count the multiples of p, q, etc., then adding back in for the multiples of pq, pr, qr, etc. that were subtracted multiple times, and so on, but that seems less obviously correct than the first approach.)
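
A quick numeric check of the product formula against a brute-force count (both helper functions are mine, just for illustration):

from math import gcd

def phi_brute(n):
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)

def phi_formula(n):
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p     # multiply by (1 - 1/p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                         # one prime factor left over
        result -= result // m
    return result

for n in (6, 12, 100):
    print(n, phi_brute(n), phi_formula(n))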
mindstalk: (Homura)
Lightly microwaved cherry tomatoes explode in warm sweetness when you eat them with pasta.

El Goonish Shive is a good webcomic.

A Miracle of Science is still a good webcomic, and unlike EGS it's long over.

A Borrowed Voice is a surprisingly good crack-premise Tolkien fanfic.

A bunch of new Madoka AMVs, which I've added to my list. I'll link to just one. Warning: spoilers for series and Rebellion.

Years ago, I proved the sin(A+B) identity from first principles while lying in bed. I think it took 40 minutes. The impressive part is that I'm usually more of a symbolic/numerical thinker than a visual one, I still slide my fingers to manipulate supply and demand curves, so doing finicky geometry in my head, no paper, was pretty impressive. Last night I thought about it again (and again in bed), and solved it much faster; I think I found a simpler solution, though I can't be sure. Alas, the margins of this blog post... or rather, I've never invested effort in learning how to make pictures on line.

The Renaissance art hallway at the MFA was more interesting than I expected, especially in the half that's largely maiolica.
mindstalk: (Nanoha)
I bought this popular math book at B&N, for my quasi-niece G2, for lack of anything obviously better on the shelf. I read it before turning it over to her, and it was fun; I think I would have liked it at her age (12) though I'm not sure if she will. Of course, it was less eye-opening for me now: I've taken classes on Bayesian statistics, I've given talks on voting systems, I've read about the file drawer problem or the exponential stockbroker scam. Still, I did learn some things.

A big one was what he calls "don't talk about percentages of things that could be negative." More specifically, say the US adds 18,000 jobs one month, and Wisconsin adds 9,000 jobs. Does that mean Wisconsin was responsible for 50% of US job growth? The governor of WI would like to say so. But say that California added 30,000 jobs -- does that mean it was responsible for 166% of US job growth? Uh... The trick being that the US number is a net sum of positive and negative (say Texas lost 21,000 jobs) numbers; taking percentages of that is meaningless.

Another example: it's said that the top 1% have taken 93% of US income growth. Sounds pretty bad. But the next 9% might (I don't recall the numbers) have taken 20% of income growth. Uh oh, that's over 100%... it can only balance via actual reductions of income in the bottom 90%. So this is better and worse: more people are doing well, but the rest are actually falling behind, not just standing still. But the "middle class" would probably be less outraged by learning that they were also doing well.


Berkson's Fallacy was something I'd vaguely heard about, but forgotten. It's explained well at the link (by him, even), but the short version is that two independent variables can look negatively correlated if you select for either of them. Like, if you notice nice people or hot people, you'll find that of the people you notice, many hot people are jerks. But this needn't mean hot people are actually prone to jerkiness, just that you ignore plain jerks.

You can extend that to any two variables of desirable things. Niceness and richness, niceness and political agreeableness... say I put up with people if they're personally agreeable or politically agreeable; this will lead to my thinking that my political allies are unfortunately prone to being jerks, when it's more that I ignore opponents who are jerks.


A couple of geometrical statistics I hadn't known: correlation being the same as elliptical eccentricity, and Pearson correlation being the cosine of the angle between two vectors. Simple that way, ugly as an algebraic formula of the components.

Also, correlation is not transitive, the way that Dad and Mom are both related to Baby but not each other. Portfolio 1 might be IBM and Apple, correlated with P2 which is Apple and Honda, but not with P3, Honda and GM.


Reminders of the ubiquity of regression to the mean, and of the variance of small populations, are always useful.


Slime molds apparently suffer a voting paradox. They like oats and dislike light, so it makes sense that you can balance them between a big pile of oats in the dark and a bigger pile under a UV lamp. It makes less sense that if you add a small pile of oats in the dark, the mold starts going for the big pile in the dark.

'hazard' comes from the Arabic for dice.

He talks a bit about the damage done by the cult of genius, when those magical moments of insight and revelation take a lot of work beforehand to prepare your brain.

Proof advice: try to prove it by day, disprove it by night. (More applicable if you don't know if it's true or false.) Might get insight into why it has to be true, or find out you were wrong all along.

Condorcet wanted the Rights of Man for women, Hilbert refused to endorse the Kaiser in WWI, and defended giving Emmy Noether a position.
mindstalk: (atheist)
I'm reading an interesting book _Democracy and Knowledge_ on how Athens worked, and it reminded me of http://en.wikipedia.org/wiki/Condorcet%27s_jury_theorem which seems relevant to the functioning of direct democracy.

Imagine a bunch of people trying to decide on some binary decision: is X true? is Y good policy? Assume they will sincerely and independently vote for the truth or best policy as they see it. Assume they each have a slight bias toward being correct -- 51%, say, or even 0.501 probability.

Then it turns out that a vote of them can be really accurate. If p is the chance of being correct, w is the number of votes for the winning option, and l the votes for the losing option, then for a particular vote gap w-l, the chance of that tally with the truth winning is C(w+l,w)*p^w*(1-p)^l, while with the truth losing it's C(w+l,w)*p^l*(1-p)^w. If you divide, the binomial coefficients cancel and it simplifies to [p/(1-p)]^(w-l).

So for p=0.51 and w-l=20, you get a ratio of 2.23, i.e. the truth is more than twice as likely to win. At w-l=100 the ratio is 54. At 1000, it's 2e17. Note this is independent of the total population size.

Even if p=0.501, you approximately just need 10x bigger gaps, e.g. w-l=1000 has a ratio of 55.
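
The ratios quoted above are easy to reproduce (a sketch; the function name is mine):

def truth_odds(p, gap):
    # odds that the majority got it right vs wrong, for a given vote gap w - l
    return (p / (1 - p)) ** gap

for p, gap in [(0.51, 20), (0.51, 100), (0.51, 1000), (0.501, 1000)]:
    print(p, gap, truth_odds(p, gap))   # compare with the figures quoted above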

So if the assumptions hold, voting on everything should be awesome, assuming you throw out cases that win by a tiny sliver of votes. How do the assumptions hold up in the real world?

Modern juries are too small, at least for such marginal individual accuracies, though if p=0.6 it's better, ratio of 130. Worse, they're not independent -- a unanimous decision doesn't mean 12 people being convinced outright by the evidence, it means them deciding after groupthink discussion and pressure to not return a hung decision and the desire to go home. To invoke this effect you'd want Athenian juries, with 501 or 1001 or more people.

Popular opinion at large isn't independent.

The math works with everyone having a slight bias to the truth; if most people are totally clueless and there's a scattering of experts, I'm not sure if the theorem holds so well. Though Wikipedia says "One very strong version of the theorem requires only that the average of the individual competence levels of the voters (i.e. the average of their individual probabilities of deciding correctly) is slightly greater than half.[3] This version of the theorem does not require voter independence, but takes into account the degree to which votes may be correlated." No time to look at the PDFs.

Some issues like quantum mechanics and much of economics are outright counterintuitive, so people may be systematically wrong on them, with p<0.5. This could be fought with better education, though you have to get the majority to admit it's wrong and needs to be educated...

You might believe that p<0.5 on everything... though that wipes out democracy except as an empty legitimacy ritual.

Partisan voters may not really count, so the p>0.5 assumption only applies to the smaller pool of swing voters.

Votes on issues are often not on fact nor even on policy by a set criterion. E.g. the Swiss ban on minarets is bad policy for general utilitarianism or human rights, but perhaps a 'good' policy for "protecting Swiss culture" or passive-aggressively discouraging Muslim immigrants. There's no clear correctness here, just preferences.

Most 'democracies' of course don't vote on issues, just on leaders. The theorem suggests people would at least be picking the best out of two leaders. Ignoring charisma, money, and candidate height(!) as disruptive factors, and assuming people *do* elect the most expert candidate, what does that mean? If an expert is right 80% of the time, and the theorem applies, the expert would easily be outperformed by direct voting. Even a 99% correct expert would be outperformed, though the difference may not matter much.

With local representatives and government, voters may well choose the apparent best legislator for their district, or leaders for their city, but this needn't aggregate to good national outcomes, as the representatives fight for local interests at the expense of global ones.

Of course, when the theorem doesn't apply for various reasons, what does that mean for representative democracy? You need the theorem to apply for evaluating the gestalt state of affairs to pick a leader, or at least to reject the current leader as failing, while being inapplicable on individual issues. Not just that the first problem is easier. If it does apply where p(reject bad leader) is 0.75, and p(decide good issues) is 0.51, then you're more likely to pick a good leader than to decide any particular issue, but per above, you may not be able to pick a leader who's as good as your collective decisions on issues. And that's leaving aside things like the principal-agent problem, of whether *any* leader will do what you want once given power...
mindstalk: (thoughtful)
"You have a balance scale but no weights. You have 12 coins, identical except for one which is lighter or heavier than the others. Identify the defective coin -- find it, and whether it is lighter or heavier, in only 3 (three) weighings."

It took me a day and a half to rediscover the solution, which I'd known before, despite having a half-remembered clue bouncing around my head. So either it's really hard or I was being particularly dense. I'm not going to give any answers or even open clues, but I thought of a series of problems that might make it easier.

Note: sometimes you could get lucky and find the coin in fewer weighings, but we're looking for the number of weighings which guarantees that you'll find the coin.

*** One coin heavier, all others normal

3 coins, one of them heavier than the others. Find it in 1 weighing.

4 coins, one heavier. Find it in 2 weighings.

In fact, you can use 2 weighings for up to 9 coins, all pretty much the same way. Do so.

How high can you go to find a heavy coin in 3 weighings?

*** One coin defective (heavier or lighter), all others normal

2 coins, one defective. Understand why you can't identify it.

3 coins, call them A B and C. A is known to be normal, and either B is light or C is heavy. Identify the defective coin in 1 weighing.

3 coins. A is normal, one of the others is defective. ID it in 2 weighings.

2 normal coins, plus 2 other coins, one defective. ID it in 2 weighings. Find two different methods.

2 normal coins, plus 3, one defective. ID it in 2 weighings. I know only one solution.

4 coins, one defective; ID it in 3 weighings.

5 coins, one defective; ID it in 3 weighings.

Find an alternate method for the 4 and 5 coin cases. This will probably be very hard but more elegant when you find it. If you think your method is already elegant, kudos. This will come in handy near the end.

4 normal coins, plus 4 coins one defective. ID it in 2 weighings.

That's all the prep work; you should then be able to find 3 weighing solutions for from 6 to 12 coins. Unlike the "one coin heavier" family the method isn't just the same method scaled up. I think 6 and 7 are similar but different from 4 and 5 (either method), then 8 needs another trick, 9 needs another, 10 combines 8 and 9, 11 has another trick, and 12 combines everything. But if you did all the prep problems, you've found the tricks, just need to put them together right.

Enjoy!
mindstalk: (Default)
In grad school Douglas Hofstadter -- yes, that one -- was my advisor, and I took several of his courses, most of which were only tangentially related to what you'd think of as cognitive science, and just forget about computers. The last one was "Mind and the Atom", which I mostly remember as reading a bunch of early philosophers, then the resurrection of atomic ideas, with Doug sometimes asking "how did they believe this stuff?" partly as incredulity, partly as an actual cognitive question: how did a bright guy like Thales even *think* of "everything is made of water?" I don't remember any answers to that. I remember trying to memorize the periodic table, and reading about how horrible the 14th century was, though that might have been my abusing my eee in class.

Also we had to give skits at one point, from the perspective of various philosophers, with Democritus et al. probably off limits due to their being right and where's the fun in that? Anyway I think I went with Heraclitus.

Or, no, this is why I keep a journal; we had an atomist group and anti-atomist. Everyone I knew was in the atomist group, but it was bigger than the anti one, so I felt I had to join the latter. So maybe someone did have Democritus.

Anyway, we had to write a paper for the end of the class. Probably no assigned topic. I couldn't think of one and procrastinated and procrastinated. I should have gone to him and asked for a topic or something, but I didn't, I don't know why. Probably massive loads of not giving a damn -- not about him, but about grad school in general at that point, and certainly not *grades*, who cares about the grades of some 7th year PhD student, beyond "why are you even taking classes?" Maybe I was depressed again, like senior year at Caltech? I have no idea, now. At any rate I managed to put it off to the night before it was due... and beyond. Had some vague idea of doing a "who was right" paper.

Finally, that morning, I had an idea: write a dialogue! Of, like, the various philosophers arguing it out equipped with modern science knowledge about which of them was most right. I think my main motivation was laziness. See, I've always been really good at writing essays -- short ones, anyway, not many long in my life ever -- but I do need a topic, and it does take a bit of effort. Can't write as fast as I can physically type, unless I've really rehearsed the ideas in my head. And even then I'd often think I had something worked out in my head, then freeze once I sat down at a keyboard.

But I do a lot of thinking as imaginary conversations, me talking to someone, real or imagined, and I've had a fair number of insights in those -- the explanation effect or teaching effect at work, probably -- and more of note here, it just *flows*. Those never block. Repeat, maybe, but not block. So I figured I could set them up as imaginary people, have them talk, transcribe it, and that'd be my paper. Might not be great, but it'd be something to turn in. Easier to write, and more *fun*, and something I'd never done before -- a challenge! Which probably shouldn't be combined with last-minute papers, but eh, it's worked before, as in the papers I wrote for some Caltech humanities classes while affected by the verbose and ornate style affected by Steven Brust through the character, or rather, fictional author, of Paarfi, in The Phoenix Guards, following in the style and structure of The Three Musketeers, or at least of whatever English translations of that book that Brust had read and, obviously, enjoyed.

And of course Doug had done all those Achilles-Tortoise dialogues in Gödel, Escher, Bach, but I just figured that meant he wouldn't reject the paper out of hand based on format, not that I'd get a better grade for it. And of course I had the idea for such dialogues from that book, hanging around in the back of my mind for years to spring.

So I finally start writing around the time it's due, blowing off the last class so I can do the paper, and yeah, it flows out fairly well onto the screen, and a few hours later I'm done and can turn it in. (Ooh, I recorded that; 3600 words in 3.5 hours, including proofreading and distractions. Probably not perfect flow, I can type 30 wpm, or 1800 words per hour. (Also, Firefox spell-checker knows 'wpm', but not 'spellchecker'.)) You can see the result here (10 page PDF); I cleaned up some typos and LaTeX gave it a new date, but otherwise it's the same.

And it *was* fun to write, and probably a lot easier than writing some more normal essay on the same topic would have been, with a topic of some sort, unless I kept the conclusion a surprise, and digesting the various philosophers in my own words, instead of what felt like their own, even though it was all really my own, but there's a cognitive difference in imagining other people, or in imagining yourself as other people -- you can fix some cognitive errors in the lab just by asking them to "think like a trader", and I've found that telling myself "think like a Buddhist" causes a calmness and "think like a happy person" makes me smile.

Also, I just realized tonight that arguably this paper is fanfic or real person fic of an odd sort, and thus the only example of such I've actually written down, at least that's longer than a paragraph. First fanfic, arguing philosophers, go me. Literally self-insert, too, since I'm in the dialogue.

Anyway.

His opinion: "Some of the best work I've seen from you." and a high grade. Which I'd had before, outside of the ambigrams class I sucked at, so I don't think I cleared a low bar.

My reaction to that: "This... this is not helping." I've thought for a long time that I keep procrastinating because I keep getting away with it; I can't easily think of any "yeah, I got really burned by putting it off" incidents, except maybe all of grad school itself (main discovery: I suck at self-motivation or choosing my own topics), while lots of getting away with it, or even being serendipitously helped by the lateness, where doing the right thing wouldn't have been as good. ("Wow, this book would be the perfect gift for John, and it's only 50 cents! Good thing I didn't give him something lamer yesterday.") Certainly wrote lots of short essays the period before class in high school, to As, and also to a 5/5 on both AP English exams, so it's not just that my teachers were coddling me. But nothing so long, so late. Best work? WTF.

I re-read it, and it still doesn't seem *bad*. Great, I dunno, I don't think I can have that thought about myself, I just have "sucks", "okay", and "other people tell me it's good". Unless I'm specifically re-writing someone else's words or ideas, then I can say "better than *that*". Done a fair bit of that online, at short scales, plus giving an intro quantum computing talk after someone else's intro quantum computing talk because I thought I could do it better. Influenced by Hofstadter, actually, and his emphases on examples, clarity, and examples [sic].

My other fluid writing mode is "stream of consciousness", you just got a full barrel of it.

***

For comparison, here's my first paper for one of his classes (12 page PDF), his Group Theory Visualized course, where I seem to have gone for a high clarity regurgitation of some of what we'd learned, especially first very basic group theory and then what automorphisms and such were, those being a new concept I'd struggled with in the class. On re-reading, I thought I'd done fine -- very clear and even funny -- until I hit section 5. Footnote 5 is opaque, I switched from phi to f for no reason, I think the endomorphism paragraph makes sense but it needs to be much clearer, and the sample mappings table is a great idea but doing something weird at the end. What the hell is *g? I think I figured it out to something sensible, but I shouldn't have had to; that violates the purpose of the paper... Section 7 should show the math, at least if I'm aiming for a naive audience (what audience was I aiming for? No idea, now.) Section 9 uses standard notation for some groups, not the notation I'd built up in the paper. Section 10 says "It may help to recall what factor groups are" when I'd never mentioned them before. Maybe the paper was aimed at my classmates, or just at my teacher? That may make more sense, trying to explain one of our topics. Section 11... proves that.

That was disappointing. I went from "this is great!" to "no, this is flawed but I can fix that" to "what's going on here and why do I care?" Maybe whipping out my best work wasn't such a high bar.

I've known elementary group theory since 7th grade, I suppose it makes sense that I was able to be really clear for the first half.

***

Also surprising: I had barely any tags appropriate for this, just 'philosophy' and 'math'. I guess I rarely LJed about grad school or classes?
mindstalk: (Default)
Yeah, I'm behind on posting news links and such. Moving on!

Given 11 people, what is the probability that 3 of the 11 share the same birthday? Assuming uniform distribution of birthdays in the population, of course.

This seems a lot harder than the classic problem of finding whether at least two have the same birthday. I debunked a couple of answers, came up with a nice one of my own for "at least 3 share, and I don't care about anyone else" which came nowhere near simulation, then someone else came up with an answer for "3 people share a birthday, and the other 8 don't share any birthdays" which matched debugged simulation, and that one's actually fairly easy in retrospect (in my defense, it wasn't the problem I'd been thinking about.) I still don't know why I'm so far off for the "at least" case.
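
Since simulation was the referee here, a minimal version of it (the two counts are "at least three share some birthday" and "exactly one birthday shared by exactly three, with the other eight all distinct"):

import random
from collections import Counter

trials, at_least_3, one_triple_rest_distinct = 200000, 0, 0
for _ in range(trials):
    counts = Counter(random.randrange(365) for _ in range(11))
    vals = sorted(counts.values())
    if vals[-1] >= 3:
        at_least_3 += 1
    if vals[-1] == 3 and vals.count(3) == 1 and 2 not in vals:
        one_triple_rest_distinct += 1
print(at_least_3 / trials, one_triple_rest_distinct / trials)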
