Stats, Money, and NYC

4,455 words


Tax Professionals in NYC

I once had a job making $35,000 a year. 35K is a ton of money, but when you live in NYC it's only a moderately large amount of money. My employer was running what looked from my angle like a tax scam because they wouldn't give us ANY tax documents (W-2, 1099-MISC, anything). I try to explain this to people and they don't understand at all. My employer had a complicated reason why this was completely legal.

After 5 years of complaints (no one knew which box to put "$35,000" into in TurboTax), they rounded up all 100 employees and brought their tax lawyer in to help us. It was the most bizarre meeting I've ever been in. For some reason, the tax lawyer was unable or unwilling to give us any tax advice. All we wanted to know was which box to put $35,000 into each year when we did our taxes. Half of us had been audited because our taxes looked fucked.

The tax lawyer gave a high-level overview of taxes and benefits and laws and never would explicitly address our situation. If anyone asked a question about our specific situation he would say "that's a question you'll have to ask your tax professional".

Then a near riot broke out. The tax lawyer was going on vaguely about "taxable benefits" and started talking about the special circumstances when health insurance was a taxable benefit. It was just meant to be a stock slide about taxes. Collectively, everyone in the audience realized that his slide was telling us that our special employment status meant our health insurance was a taxable benefit. And no one had been paying taxes on this benefit for the last five years.

So there's one hundred people in a room that make poverty line money, just realizing they owe tens of thousands of dollars in back taxes, and every time they try to get an honest answer out of a millionaire, they're told to talk to a tax professional who would surely charge $200 an hour. It got really ugly fast.

Second Amendment in NYC

I've lived at my new place for 6 months. I live on the ground floor. In this time, there have been 6 people shot 50 feet from my window. Dudes literally just hang out on the corner all day playing dominoes, dice, and waiting to shoot people. It's really not as glamorous as it appears in the Snoop Dogg videos I've seen. The dice still looks fun, though.

There's so much weird shit in my neighborhood.

If I go on a walk on a Saturday morning (7-8 AM), the streets are lined with dudes with their backs up against the storefronts just staring at the street. Completely burnt up. Probably about 10-15 dudes per block. I'll try to take a pic of it next weekend if I remember.

The local park has an outdoor gym where people live in tents, blast music, and work out all day.

You can only buy lottery tickets through a bulletproof window.

CVS only conducts business with morally upright customers.

The hardware store has no supplies, but has 10 employees huddled around a 1990s TV watching VHSs.

I can't make this stuff up.

Fourth Amendment in NYC

Every day you get on the train in NYC, they remind you that the police can look through your belongings for no reason. It's a nice way to get you frustrated before you even get to work.

This type of thing is what really bothers me about political parties. The Democrats do a bunch of terrible things, but at least they are supposed to be the party that protects the middle of the bill of rights. So when they control NY and completely disregard the middle of the middle of the bill of rights (along with the typical amendments they disregard), it makes me wonder what they are doing at all.

At least they have the arrogance to get on the loudspeaker every morning and yell "backpacks are subject to random inspection by the police". The Republicans usually try to pretend they are defending their amendments, which frustrates me even more.

Fishing in NYC

You can get nearly anywhere in the city on public transportation within an hour. However, last Saturday I was fishing somewhere that took 4 hours to get to even though it was only a 30 minute drive from my home.

When I got there I met Mike, a herring fishing expert. Most people that fish are completely burnt up, so there was nothing out of the ordinary with Mike talking like a crazy person.

Mike sold me some lures, helped me tie them, and showed me how to jig for herring.

Then he wanted to lower his lamp into the water and tie the lamp's rope to the railing. I held the rope while he tied it. Over the next four hours he kept wanting to move the lamp to different spots. Each time he started brushing against my hand more and more and then touching and then resting. It got weird.

It's midnight and I'm 4 hours from home. "I'm going to head home". Mike offers me a ride home in his white van that doesn't have any windows. I think I'll take the bus.

Sexual Selection. Or as Christians call it, "Falling in Love"

The first obstacles (to any reasonable idea):

  1. Euphemists. "I mean merely that short words startle them, while long words soothe them. And they are utterly incapable of translating the one into the other, however obviously they mean the same thing."

  2. Casuists. "Suppose I say "I dislike this spread of Cannibalism in the West End restaurants." Somebody is sure to say "Well, after all, Queen Eleanor when she sucked blood from her husband's arm was a cannibal.""

  3. Autocrats. "They are those who give us generally to understand that every modern reform will "work" all right, because they will be there to see"

  4. Precedenters. "The Church is expressly bound to meekness and charity; and therefore cannot be cruel"

  5. Endeavourers. "as if one had a right to dragoon and enslave one's fellow citizens as a kind of chemical experiment"

Wilderness in NYC

"For maybe 99 percent of human history, a few million years, humans were hunters. They didn't get up and go to work each morning. That started with civilization and civilization is nothing but a heart beat of recent time. Ten thousand years at the most, and to hell with that. I want to wake up naked and alone in the desert. I want to eat sand, and drink piss, and pass out screaming from sun burns and spider bites. But I know it won't work, and I know it won't happen."

-- Scott Carrier

People pretend there's great wilderness in NYC. It's not true.

I have three options for fishing this weekend:

  1. Take a 2 hour bus to go to a spot that should be good for catching herring.

  2. Take a 2 hour train to use the only legal winter trout fishing river.

  3. Take a 45 min subway to the beach.

I won't catch any fish with any of these options. For bass fishing season, there's tons of places in NYC. Tons of ponds with half pound bass that are completely dormant.

Table not found in Hive

Hive causes a lot more pain than it should. My favorite examples are when it tells you 'Table not found'. Usually that means Hive isn't being very smart.

Try to create a table then take 10 rows from it:

CREATE TABLE listed2938.TABLE2 as
SELECT * FROM listed2938.TABLE1;

SELECT * FROM listed2988.TABLE2 LIMIT 10;

...Table not found 'TABLE2'.

Then you spend forever at the command line trying to figure out why it can't find TABLE2, even though you are sure TABLE2 exists. After 5-10 minutes you realize the error should have been Database not found 'listed2988'. Or Table not found 'listed2988.TABLE2'. If it can't find the database, why is it explaining that it can't find the table?

That one is just stupid. This next one is a bug that really irritates me.

You can't use [IF NOT EXISTS] if you are trying to overwrite a partition without first using the database. That sentence probably doesn't make sense because it seems so ridiculous for that to be the case.

SELECT * FROM listed2938.TABLE2;

USE listed2938;
SELECT * FROM listed2938.TABLE2;

The first query gives Table not found 'TABLE1'. I do not even know what problem it's having. Maybe it's trying to use the default database?

Maybe this will be part one of a one-thousand-part series on how irritating Hive can be. I've got all sorts of problems.

Smoking in NYC

When you step into a subway train and see someone sleeping on the seats, you quickly smell the train to figure out if you need to change cars before the doors close. 80% of the time, it's a homeless guy who has spread a disastrous odor through the entire car.

Yesterday, I got on, saw a guy sleeping, smelled nothing, and sat down across from him.

Two minutes later, he kind of rumbled around, pulled a cigarette out of his pocket, lit it and pulled. He didn't even sit up or open his eyes. 2 minutes of smoking. Then he dropped his cigarette and lighter on the floor accidentally because he fell back asleep.

It was so crazy this guy was smoking on the train. We all looked at each other like "are you seeing this".

Then I realized how strange it was that it was strange. Homeless dudes do so much crazy stuff on the train, but smoking is outside of their range. Sip beer, drop it, and spill it all over the floor. Pee on the floor. Ejaculate into whatever they want. Yell at someone. Yell at no one. Yell at nothing. Vomit. But no one would expect them to smoke. Why not?

How Copulas Work

You can simulate a correlated bivariate gaussian distribution easily:

And the marginal distributions are gaussian:

But what if you want a correlated joint distribution where the marginals are whatever distribution you can think of?

You can take your correlated bivariate gaussian sample and feed each dimension through the gaussian CDF. The resulting distribution is uniform in each dimension but the two uniform distributions are still correlated. To reiterate: you start with a sample that is gaussian in dimension 1, and gaussian in dimension 2, and you end up with a sample that is uniform in dimension 1 and uniform in dimension 2 (but the two uniform dimensions retain the correlation that the original correlated bivariate gaussian sample had).

Here's the uniform joint distribution for the original gaussian sample:

This is an intermediate step. The jointly correlated sample with uniform marginals is a sample from the copula.

And the marginals are uniform:

Then you feed the joint uniform sample through the inverse CDF of whatever distributions you care about. Here I'm using the gamma and beta distributions, but honestly, whatever you want. The resulting joint distribution retains the original correlation from your bivariate gaussian sample:

And the marginals are beta and gamma:

Summary: Correlated joint gaussian -> Intermediate correlated joint uniform -> correlated whatever you want.
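The three steps in the summary can be sketched in a few lines. This is a minimal sketch, assuming NumPy and SciPy are available; the correlation of 0.7 and the gamma and beta shape parameters are made-up values for illustration, not the ones behind the plots in this post.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
rho = 0.7  # assumed correlation, for illustration only
cov = np.array([[1.0, rho], [rho, 1.0]])

# Step 1: correlated bivariate gaussian sample (standard normal marginals).
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=10_000)

# Step 2: feed each column through the gaussian CDF.
# Each column is now uniform on (0, 1), and the columns are still correlated.
u = stats.norm.cdf(z)

# Step 3: feed each uniform column through the inverse CDF (ppf)
# of the marginal you want. Shape parameters here are assumptions.
x_gamma = stats.gamma.ppf(u[:, 0], a=2.0)
x_beta = stats.beta.ppf(u[:, 1], a=2.0, b=5.0)


def rank_corr(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return np.corrcoef(stats.rankdata(a), stats.rankdata(b))[0, 1]


print("rank correlation, gaussian sample:", rank_corr(z[:, 0], z[:, 1]))
print("rank correlation, gamma/beta sample:", rank_corr(x_gamma, x_beta))
```

One thing worth noticing: the CDF and inverse CDF are increasing functions, so they keep the ordering of the points. That means the rank correlation carries through every step, while the ordinary (Pearson) correlation can shift a little once the marginals are no longer gaussian.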

The only technical part here is why does the CDF trick work? There's a proof here, but the visual in figure 1 has a good intuitive explanation:
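In short, the argument is only a couple of lines. This is a sketch under the assumption that the CDF $F$ is continuous and strictly increasing (so $F^{-1}$ exists); the symbols $F$, $G$, $U$, and $Y$ are my notation, not taken from the linked proof.

```latex
% Forward direction: pushing X through its own CDF gives a uniform.
% Let X have CDF F and define U = F(X). For any u in (0, 1):
\[
  P(U \le u) = P\big(F(X) \le u\big) = P\big(X \le F^{-1}(u)\big) = F\big(F^{-1}(u)\big) = u ,
\]
% so U is Uniform(0, 1).

% Reverse direction: pushing a uniform through an inverse CDF gives that distribution.
% Let G be the target CDF and define Y = G^{-1}(U). For any y:
\[
  P(Y \le y) = P\big(G^{-1}(U) \le y\big) = P\big(U \le G(y)\big) = G(y) ,
\]
% so Y has CDF G.
```

Both maps are increasing, which is why the ordering of the points (and so their dependence) survives each step.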


Sexually Oriented Businesses in NYC

Giuliani had no love of the 14th amendment. He made these hilarious laws regarding sexually oriented businesses that still exist today. If you ever see a sexually oriented business in NYC go in and look around.

There's this bizarre law about how only 40% of your floorspace can be devoted to adult content/activities. So if you go into a sexually oriented business, they have huge amounts of floorspace that has nothing to do with their business. Most strip clubs just have huge areas that you can't access.

I went in an adult video/toy store today and saw a hilarious take on the law. They had >60% of their floorspace devoted to family friendly VHS. Teenage Mutant Ninja Turtles on VHS, Shakespeare in Love on VHS, etc. I'm talking two giant rooms full of VHS that they had no intention of selling. Also, why turn the heat on? Customers have no interest in this section so it was about 30 degrees. Also, why turn the lights on? The rooms were more or less pitch black. I soaked it all in for as long as I could handle the owner staring at me.

Then they had their viewing booths (more on these in a minute). Then they had a tiny room of adult DVDs and toys. I am not exaggerating when I say they had 20 times the number of family friendly VHS compared to adult DVDs.

But back to the viewing booth. I didn't go in today, but I went in one once. It was remarkable. I went into an adult shop that I thought was empty so I asked the guy behind the counter what the deal with the viewing booth was.

"What do you mean?"

"What goes on in the booth?"

"You pay 5 dollars and then you get 8 minutes of a TV that has 25 channels."

"Ok, I'll try."

And then I got out my credit card to pay him (this is where I showed him that I wasn't pretending. I had no idea what I was doing). He told me you pay inside the booth. There is a machine attached to the TV and it only takes cash. I dug around in my pockets and found four dollars.

"I only have four dollars, dang"

"You know what? (pulls a dollar out of the register). Here (hands me a dollar)"


I start walking over to the booths. There's about 10 booths, five on each side of a hallway. Before I made it to the booths, he says "Here, try this one" and points to the booth behind him. This really freaked me out. Why did he want me to go in a specific booth? I saw someone was using one in the back (occupied light was on), pretty far from the one he suggested. This had me even more on edge because I thought the place was empty. I walked towards the back past the door he suggested, but he insisted. "Really, try this one". Alright. Just trying to stay on the bull for my first rodeo.

I assumed the booths would be private. I put my five dollars in the machine and flipped through some channels. Everything looked like it was filmed in the late 80s or early 90s. Who would want this? On top of that, the room wasn't even private. There was a hole in the wall facing the next booth over. Then it all came together. That's a glory hole. People don't use these booths to watch adult films. The guy behind the counter knew I was just curious and he steered me away from accidentally stumbling into a booth next to another dude.

Peak 2014

Transportation in NYC

Above is how much we spend each month on transportation.

Every day I get on the subway, sardined in with too many people. The guy next to me coughs in his hand and then rubs it all over the pole. The guy behind me coughs. The guy lying on the bench taking up 300% too much room has fresh urine all over his pants. Three stops later the first white people get on. The coughing intensifies and so does the number of $4.00 fancy drinks in cardboard cups.

Each conductor has their own method of dealing with passengers that try to cram into the train even though they don't fit and end up delaying the train because the conductor can't shut the door. My two favorite conductors do the following:

  1. If the delinquent passenger ends up getting on the train, the conductor mashes the "Please don't hold the door open" PSA button the whole way until the next stop and makes the passenger listen to it over and over while everyone glares at him.

  2. The conductor gets on the mic "WHAT UP WHAT UP NYC. IF YOU HOLD THE DOOR OPEN IMMA SIT HERE UNTIL YOU EXIT THE CAR." Then we sit there for another minute until the delinquent passenger realizes the conductor ain't messin around.

Bookstores in NYC

I had three hours to kill before meeting up with someone after work. In order, how I spent my time:

  1. Subway ride (20 min)
  2. Bought and ate Candy (5 min)
  3. Strip Club (20 min)
  4. Starbucks (90 min)
  5. Bookstore (45 min)

When I got to the bookstore it was 7:15PM in LIC. The bookstore was 2 floors, giant space. Rent must have been $8K/month. They had 3 employees ($22K/month). They closed a total of $0 in sales while I was there, and it wasn't looking like it was going to pick up any time in the next decade. But at least the place looked really hip.

I needed to know how this place could be such a disaster so I talked to the guy behind the counter.

"Is this place new?"

"{long winded way of saying we've been here for a year}"

"Oh. Is it not going well?"

"People seem really happy that we are here"

I know that is word-for-word what his response was because it's been stuck in my head. Oh, great. People are happy you are here.

NBA Markov Chains

I'm working on a way to simulate NBA games where I simplify all the conditional probabilities by making the Markov chain assumption that the probability of each event is conditional only on the previous event. This seems like a really nice compromise between disregarding conditional probabilities altogether and overfitting probability estimates that are conditional on the previous 100 events.

Also, I think very few basketball fans think the event probabilities depend on much more than the previous event.

Well, also conditioned on the players on the court. And the coach?

I'm going to use a Bayesian approach to estimate the conditional probabilities, but I don't think the prior is too important because there are so many events if you only condition on the previous event.
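Here is a rough sketch of what that could look like, assuming a symmetric Dirichlet prior on each row of the transition matrix (one common Bayesian choice, not necessarily the one I'll end up with). The event names and the toy sequence are made up for illustration and do not come from the real possession data.

```python
import numpy as np

# Hypothetical event vocabulary; real possession data would have many more event types.
EVENTS = ["made_shot", "missed_shot", "def_rebound", "off_rebound", "turnover"]
IDX = {e: i for i, e in enumerate(EVENTS)}


def posterior_transition_matrix(sequence, prior=1.0):
    """Posterior-mean transition probabilities under a symmetric Dirichlet prior."""
    k = len(EVENTS)
    counts = np.zeros((k, k))
    # Count how often each event follows each previous event.
    for prev, nxt in zip(sequence[:-1], sequence[1:]):
        counts[IDX[prev], IDX[nxt]] += 1
    # Dirichlet(prior, ..., prior) acts like `prior` pseudo-counts in every cell.
    smoothed = counts + prior
    return smoothed / smoothed.sum(axis=1, keepdims=True)


def simulate(P, start, n_events, rng):
    """Sample a chain where each event depends only on the previous event."""
    state = IDX[start]
    out = [start]
    for _ in range(n_events - 1):
        state = rng.choice(len(EVENTS), p=P[state])
        out.append(EVENTS[state])
    return out


# Toy sequence, invented for illustration (not real NBA data).
toy = ["missed_shot", "def_rebound", "made_shot", "missed_shot", "off_rebound",
       "made_shot", "turnover", "missed_shot", "def_rebound", "turnover", "made_shot"]

P = posterior_transition_matrix(toy, prior=1.0)
rng = np.random.default_rng(0)
game = simulate(P, "made_shot", 20, rng)
print(game)
```

With only a handful of transitions the pseudo-counts matter a lot. With a very large number of observed transitions per row, the counts swamp the pseudo-counts, which is why the prior should not matter much here.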

So far, all I've made is a giant database of every possession since 2017 and every detail about the possession. I will email it to you if you ask.

Percent chance of rain

Google's weather service is ambiguously frustrating.

I'm going to be outside tomorrow from 1 pm until 6 pm. What is the chance it rains while I am outside? Google says 20% chance it rains at 1 pm, 30% chance it rains at 3 pm and 20% chance at 5 pm. What does that mean for my question?

Do you kind of just average all of those and say there is a 23% chance it rains while I'm outside? Do you say each of those is the probability it rains in the given window, and each window is independent? Then the probability it rains is very very different. The probability it rains in at least one of those windows under this regime is 55%. The assumption that the window probabilities are independent seems definitely wrong, so what are the window-conditional probabilities? Every time I see these window percentages, this goes through my head.
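The arithmetic behind those two readings, plus the best you can say if you know nothing about how the windows depend on each other, fits in a few lines. This is a sketch that assumes each percentage is the probability of rain somewhere in its window, which is only one possible reading of the forecast.

```python
# Window probabilities as quoted from the forecast.
p = [0.20, 0.30, 0.20]

# Reading 1: just average the windows.
average = sum(p) / len(p)

# Reading 2: windows are independent, so
# P(rain at least once) = 1 - P(dry in every window).
p_dry = 1.0
for q in p:
    p_dry *= 1.0 - q
independent = 1.0 - p_dry

# With unknown dependence between windows, only bounds are available.
lower = max(p)            # rain in the windows overlaps as much as possible
upper = min(1.0, sum(p))  # rain in the windows never overlaps (union bound)

print(f"average:      {average:.0%}")
print(f"independent:  {independent:.0%}")
print(f"bounds:       {lower:.0%} to {upper:.0%}")
```

So under this reading the answer is somewhere between 30% and 70%, and 55% is what you get in the special case of independent windows.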

Brothels in NYC

I used to live in a residential neighborhood in Queens. I needed to go to a welding shop and Google said there was one just down the street from my place in a direction I'd never walked. I went there and the neighborhood was entirely different. It was an industrial area with semi trucks lining the street.

I was there at five in the morning and there was a building with an open sign that I was 20% sure was a brothel. I decided to check it out. When a 45 year old woman in a sparkly and revealing gymnastic outfit answered the door, I was 100% sure it was a brothel.

"how much does it cost?"

"80 dollars for an hour. Plus tip"

"Ok I'll go get cash and be right back"

I never came back but now I can detect brothels everywhere in NYC. I think everyone walks right past them because they aren't looking for them. I usually go in and ask how much. It's always 80 dollars for an hour.

Video Poker in NYC

Gambling is illegal in NYC basically. You can read my previous post on how much I love the 14th amendment if you want to know how I feel about it.

For some reason, all the bars in Queens have a video poker machine in the back. They always say something like "Does not dispense money". But who's playing video poker for fun? Does the bar owner give you tab credit if you win? Cash? How does it work?

One time I thought I would give it a go. I was in a townie bar, with a bunch of townies at the bar. I got up to the machine and started fiddling around in my wallet for money. I never got to figure out how it worked because someone came running from the bar and said "whoa whoa whoa I'm in the middle of playing on this machine."

I still don't understand what happened there.

Ranking the Bill of Rights


Do I only not care about 3 because it's such a non-issue? If the DNC or GOP started attacking 3 all of a sudden, would I really care about it?

I wish 14 was in the Bill of Rights. Of all the amendments, it's the one that makes me feel the best to think about.

My friend born outside this country likes 5 the best. His amateur opinion is that it is what makes our government so special. I say: what good is 5 if you don't have 14 to protect you from the state government? What good is any of this if you don't have any protection from your state government?

Parametric T-SNE

The original (non-parametric) TSNE paper has almost 6000 citations and everyone uses it. One year after the paper came out, the same author wrote a second paper describing parametric TSNE where he trains a neural network to minimize the TSNE loss. This paper has fewer than 200 citations and no one knows about it even though parametric TSNE is much more useful.

You train the NN once and then you can embed an arbitrary number of data points. This is what everyone actually wants to do.

How come everyone uses non-parametric TSNE? I think it's because everyone uses scikit-learn, which only has non-parametric TSNE.

>>> import numpy as np
>>> from sklearn.manifold import TSNE
>>> X = np.array([[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
>>> X_embedded = TSNE(n_components=2, perplexity=3).fit_transform(X)
>>> X_embedded.shape
(4, 2)

Nothing is easier than fit_transform. The only (good) parametric TSNE implementation I could find is in MATLAB. Yikes.
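One way to see the limitation concretely is to check which methods the scikit-learn estimator exposes. This is a small sketch assuming scikit-learn is installed; PCA is only brought in as a comparison of my choosing, since it learns a reusable mapping.

```python
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# PCA learns a mapping during fit, so it can embed new points later.
print(hasattr(PCA(), "transform"))

# scikit-learn's TSNE exposes fit and fit_transform but no transform,
# so new points cannot be embedded without refitting on everything.
print(hasattr(TSNE(), "fit_transform"))
print(hasattr(TSNE(), "transform"))
```

A parametric version would fill exactly that gap: fit once, then call something like transform on any new data.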

My main question with parametric TSNE is if data moves a tiny amount in the original space, is it guaranteed to only move a tiny amount in the embedded space? That seems important for a lot of applications (like tracking changes over time).

Drug Addict

My eyes are completely sunken in. Every time I wash my hands after using the bathroom at work, I look in the mirror and think to myself "I should go out there and do the best job I can. If I lose this job, it's going to be really hard finding a new job looking like a drug addict".

Then I go into the same thoughts about how looking like a drug addict runs in my family. Strangely, being a drug addict also runs in my family, but those two things aren't perfectly correlated I guess.