February 8, 2019 | Dataset used: 20,783 Pitchfork reviews
Nick James Scavo is a musician and writer based in New York City. He works in various solo and collaborative projects with The Actual School, contributes regular reviews and essays to Tiny Mix Tapes, and has presented work with MoMA PS1, Artists Space, The New Museum, Rhizome, Control Synthesizers and Electronic Devices, and in residence at NTS Radio. He is Communications Director and Assistant Curator at ISSUE Project Room.
Throughout this interview, the term “bell curve” is used loosely to refer to any unimodal distribution in which values around the median indicate middle-range scores of evaluation, rather than to refer to a symmetrical distribution.
Andrew Thompson: I think the place to start is just the graph of the distribution of Pitchfork’s scores overall to look at how Pitchfork tends to rate things, as a way to consider the idea of the Pitchfork rating and this system of evaluation.
So this is a histogram of how their scores are distributed. The average Pitchfork score is almost exactly 7, the median is 7.3. From your perspective, what is the Pitchfork rating system? What is its intent and its effect?
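The summary statistics referenced here can be sketched in a few lines of Python. The scores below are synthetic stand-ins for illustration, not the actual 20,783-review dataset, so the printed values will differ from the figures quoted in the interview.

```python
import statistics
from collections import Counter

# Hypothetical sample of review scores (stand-ins for the real dataset).
scores = [6.8, 7.0, 7.3, 7.3, 7.8, 8.1, 5.4, 7.5, 6.2, 9.0, 7.4, 7.7]

mean = statistics.mean(scores)
median = statistics.median(scores)

# Bucket scores into whole-number bins (0-10) to sketch the histogram shape.
histogram = Counter(int(s) for s in scores)

print(f"mean={mean:.2f} median={median:.2f}")
for bucket in sorted(histogram):
    print(f"{bucket}.0-{bucket}.9: {'#' * histogram[bucket]}")
```

Even on a toy sample, the mass of scores clusters in the 7-range, echoing the shape of the real distribution discussed below.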
Nick Scavo: When you look at a model like this I can’t help but think of being graded on a bell curve. So there’s obviously an assumed educational power dynamic existing from the expert or the professor to the critic in assigning an actual grade to the performance in some way. I feel in the history of criticism, you don’t normally have an actual numerical value. For the most part, in the history of 20th Century Art, you have some kind of intellectual engagement. But with Pitchfork and publications like it at the beginning of the 21st Century, critics mirrored the content they were addressing by operating within industrialized conditions—they started assigning numerical value to aesthetic work. I think this might be relatively specific to the form of music—for example it’d be strange to see a quantified value assigned in visual art. Could you imagine someone grading a recent exhibition at MoMA?
AT: No but you would see and do see some attempt at quantification for film, and this is going back to the 90s—
NS: Like Roger Ebert—
AT: Right, thumbs up thumbs down, which has moved on to whatever Rotten Tomatoes aggregates into a meta review. With Pitchfork, what has always differentiated it from Roger Ebert and everyone else who uses these kinds of systems is the exactitude with which they purport to evaluate a work—that there is this decimetric precision to how good or bad something is. When you say something is a 7.7, you’re saying it’s not a 7.6 or a 7.8, it’s exactly this valuable. And there isn’t the vagueness of even a star system, like one to five stars. It’s like they’re trying to surpass any kind of interpretability of the score or any admission of generality that a three-star review might imply, and instead it’s conveying this scientism.
NS: It’s strange because although it has the appearance of exactitude or science, like the bell curve, it’s a much more relational type of number. The reason why a score is 7.7 instead of a 7.6 is because perhaps another album was given a 7.6, and we need to look at that album and assert that this album is better. So, in a way, it’s following a line, and that line is drawn in relation to a series of points developing off of each other — and that’s what you have here: a distinct distribution of music reviews. When you look back at a previous year and look back at all the Best New Musics of 2010 or whatever, you see a series of cultural values that Pitchfork’s writers have given, and then you see distinct numerical scores given to them. And it’s always strange because you’ll have someone give something a 10 but oftentimes that’s not the number one album of the year. It’s a very loose, contestable, relational numerical situation where there’s almost a false or pseudo-quantification happening.
AT: Right, I mean, without a doubt it’s false. There’s also a cynical materialism to it too. I remember thinking that after I listened to Sote’s Sacred Horror of Design, which was one of the best things I heard last year and I think truly understood the idea of the artistic experience and cracked the world open when I heard it. And Pitchfork gave it…let me see what they gave it. They gave it a 7.8. And the experience of this album was this completely nonmaterial one, right? To see whatever experience I had of this album reduced to a 7.8 felt very contradictory to the spirit of music to me.
NS: I’d agree 100 percent. It’s a way of establishing power or authority, ultimately. It’s an attempt at objectivity that I don’t think is applicable in this context. I mean, it’s obvious that a Pitchfork review is one writer’s attempt at quantification. But, as a reader, we are tasked with responding to the quantification—do I agree with this or disagree with this? People obsess over this kind of situation. That’s why people are obsessed with editorials in general, or op-eds––they want to disagree with the author, and when they see a score, it’s an almost sensationalized provocation. They don’t even have to read anything. They just see this ridiculous number and just spit at it and then you’re on the internet at midnight waiting for the next Pitchfork review—I don’t really think this is the case anymore, but in its heyday, there was a time when I would literally refresh the page at midnight to see the new Pitchfork reviews for an album just to see what value was given to it. And that was the height of indie online internet blog tastemaking, 2008 and 2009, around that time. And I feel that’s been diffused a little bit out of the sheer multiplicity of information sources. When you look at this distribution, I think the safest thing you can do is give something a 7.8. It’s essentially just affirming the record––and you have to question the goals of writing a review in this way. In this case, I think it’s part of a PR cycle more than anything. An extremely low review or an extremely high review has a sensational quality that’s going to create some kind of rupture in the public discourse. If an album gets the Pitchfork 10 it’s this rare thing. It’s equally rare when records go below 3 or 2 or 1. I would ask this of the data: since maybe 2009, 2010, as music has become more industrialized, more ubiquitous in our consumption of it through things like Pitchfork, I would only guess that the actual number of mediocre, median/mean reviews has only gone up.
AT: Well, let’s have a look…The median has kind of moved around a bit [fig. 2, top], but the variance has actually gone down [fig. 2, bottom].
NS: It’s pretty wild. The freewheeling days of early internet music criticism.
AT: It’s gone down by…what is that…by more than 30 percent since 2001. So you’re right to an extent. Scores have begun to hug the middle more over time.
NS: I think what that shows is just that literally it has become less independent. When you had a larger deviation of scores, you had pretty distinct perspectives that were making extremely diverse assertions about music. You were going to have people veering all the way to the left and then veering all the way to the right, as opposed to more middle-of-the-road reviews, which I think is a kind of normalization, which mirrors every economic trend we’ve seen over the past 20 years— including the American political discourse.
AT: The relentless homogenization of everything.
NS: Not to use this word too freely, but it’s a kind of hypernormalization, this literal squeezing of the deviation into this closer assemblage. Things are closer together, they’re restricted and can’t move the way they once could.
AT: I mean on the one hand, looking at this original chart, there’s something to be said perhaps, maybe, that work does fall along this shape, that most of it is pretty mediocre: There’s a good amount of bad work, and then a small amount of it is truly great. So is there something to be said that this reflects the reality of art? That even if their numerical system is a false objectivity, in that false objectivity it is tracing out this shape of how things kind of are?
NS: I think that’s a narrative we have to question. I think that’s a model that is dominant in the way we look at things, the bell curve, where the majority of work is mediocre and that there’s a lesser extent of very select, genius works. This is a myth we have to question a little bit because it is such a dominant one in western thinking and in capitalism. It is a model we have been programmed to look at so regularly. I would argue—I don’t believe in complete relativism in taste, but I would want to maybe have a less hierarchical and perhaps more distributed model for how we would think about our relationships with art and aesthetic experience that perhaps would look completely different than a bell curve.
AT: You could almost join these two charts together [fig. 1 and fig. 2, bottom], and say that the reviews feel almost beholden to reflect this idea of a natural model. And there’s this idea of what I just said: Doesn’t this reflect nature? That the bell curve is natural. Obviously we’ve seen all kinds of problems in history caused by believing that everything necessarily fits along bell curves. And the longer Pitchfork goes on, the more they feel they need to hew to this bell curve for its own sake, and that should they deviate from granting things a 7.3, a 7.4, it is unnatural in some sense. And I think that’s a good idea: Maybe there are more works that are, if not genius—maybe we should just get rid of the idea of genius works altogether.
NS: Yeah, and I feel like there are so many experimental and critical models we employ to quantify our experience of listening to music that would be really exciting to talk about and think about. You could propose model after model that completely disrupts the bell curve. And you could use those models in entirely new ways in writing about music as well. The thing that’s kind of depressing about Pitchfork specifically, and how minute their scales get, is you have almost 20 years of history now of them using the same model over and over and over again. And I think you’re right in looking at that data and pulling out the kinds of assumptions that are in it. And when we do look at that data, what we see is the same depressing reality we see everywhere else in our world, which is that things are getting less distinct, more median, more auto-articulated, same after same 7.8 or 7.3 or 7.4. If you look at the Best New Musics of the last five years against the years prior, we actually have more market-ready, market-friendly Best New Music. So it’s actually performing an industry function. It’s given value for its ability to perform in a careerist way as it relates to what the music industry wants of it, which is to sell records, to have successful tours, to perform well on streaming platforms.
This might be hard to do because it’s subject to the economic flows of history, but I’m curious if the Best New Musics of the last five years have better profitability than the Best New Musics of the 90s or the early 2000s or that time.
AT: It’s hard to say by just looking at the reviews themselves without pulling in outside data, like tour revenues and album sales or streaming figures and that kind of thing. But what I’d found looking at this data earlier was that the things that Pitchfork gives 10s to tend to be these nostalgic look-backs.
NS: Oh yeah. Reissues.
AT: Right, retrospectives.
NS: Here, we have the idea that hindsight is 20/20. You get a more perfect vision when you’re able to look back on culture and say, that was the exact zeitgeist of the time. That record really did sum up X thing that happened or whatever. It’s much messier to actually figure out what’s important now, especially when it just gets obscure to the nth degree.
AT: It feels like a review is no longer taking a risk at that point either, right? If you voice an affirmation for an already established critical opinion, you both take no risk while also portraying yourself as the canonizer, kind of responsible for an album’s renown and its place in musical history.
The further back you go, the higher the scores. So by the time you get to 10 years, 20 years, things aren’t really…there’s something hagiographic about the past.
NS: One of the most memorable Pitchfork 10s I remember was Radiohead’s Kid A. That was 2000 or whatever, and at the time, when I was very young, I felt that record was undeniably a perfect 10. It was the tome of what a good kind of experimental music record could be, based on my education and music listening at the time. And as much as I thought it was this obscure, weird sounding album, in 2000, it’s not like Radiohead was a small band at this moment. It was this very branded decision for Pitchfork to give that record a 10. A lot of other publications, mass media music publications, were decrying and talking against their shift in sound from OK Computer. Everyone was acting surprised. But that was a shift for Pitchfork too, because I think their popularity rose exponentially after 2003 and into the late aughts.
I also wonder: what’s the distribution of adjectives that exists in music criticism text?
AT: You mean the number of adjectives?
NS: Yeah, like how much does the music critic depend on description of sound by using adjectives, and how instrumental is the adjective to our understanding of what sound’s meaning is? There’s a tendency to narrate music, to describe it using descriptive words like “rough”, or metaphors like “dropping water on steel”.
Adjectives were identified using Python’s spaCy library and its largest English model. SpaCy is very good, but like all Natural Language Processing tools, it’s not perfect. A few minor errors exist here, mostly in the tail of very infrequently used words, where words like “gyaaaah” were labeled as adjectives. A small number of words were also not properly identified as adjectives, among them “genius”, which the library unfortunately always identifies as a noun.
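A minimal sketch of the counting step behind this note. The (token, part-of-speech) pairs below are hand-labeled stand-ins for a hypothetical review sentence; in the actual analysis, spaCy’s en_core_web_lg pipeline supplies the tags (e.g. via each token’s pos_ attribute).

```python
from collections import Counter

# Hand-labeled (token, POS) pairs standing in for spaCy output, e.g.
# [(tok.text, tok.pos_) for tok in nlp(review_text)] with en_core_web_lg.
tagged = [
    ("The", "DET"), ("ethereal", "ADJ"), ("synths", "NOUN"),
    ("drift", "VERB"), ("over", "ADP"), ("stark", "ADJ"),
    (",", "PUNCT"), ("rough", "ADJ"), ("percussion", "NOUN"),
]

# Keep only tokens tagged as adjectives and count their frequency.
adjectives = Counter(tok.lower() for tok, pos in tagged if pos == "ADJ")
print(adjectives.most_common())
```

Run over all 20,783 reviews, this kind of tally yields the adjective distribution discussed below; the tagging errors noted above ("gyaaaah", "genius") enter at the POS step, not the counting step.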
AT: As you see it, what’s the problem with these words and how should one write about music?
NS: With the practice of writing about music and regularly quantifying it and trying to substantiate that quantification, oftentimes we have to go to the actual text and look at the words that are used to give things meaning. So in order to support something’s 8.8 or whatever, we have to look at the adjectives that are used to celebrate them and establish its meaning in English. Oftentimes this language isn’t an actual defense of the music as being substantial in any way. This writing doesn’t contain political arguments asserting its meaning, or social commentary. We might get flashes of that every once in a while in reviews, but I would say the dominant method for defending the status of the quantifiable score is in this type of descriptive language, these flimsy words—you can just peel them off like paint chips, they just come right off, and what are you left with when you peel all those words off? What are the actual claims being made in the music review itself?
When you look at these reviews, what is this person even trying to say? That this is some funky thing? This is like a funky piece of music? It’s really hard to locate what’s actually substantiated in the language, the quantity that’s represented as a number. That’s why I’m interested in adjectives specifically, or cliches, because I feel those are the first line of defense.
AT: So it’s almost like the quantification of the value has to justify itself, and almost by using these words, each word has a value in itself, and aggregated on top of all the others, that amounts to this mosaic that equals an 8.1 or whatever.
NS: Yeah, because as we’ve already said, the number embarrasses the language. And the writer knows that and has to substantiate it in language. That’s the brief given to the writer—they have to substantiate it in writing, they can’t just give a number. One of the things we see is a whole host of assumptions that reflect the basic assumption that’s made in the quantifying of the music itself. And you know, I just have ideological differences with all that. And we have to—I mean, I do that shit all the time too, I regularly use language like that to describe music.
AT: I use language like that all the time too. Like “stark”, I always use the word “stark”. Not with sound necessarily, but we use adjectives to describe things, like “ethereal”. If it’s “ethereal”, do you have to not say it’s ethereal?
NS: I don’t really have a problem with those words and in fact I do think they conjure certain images. Often, these words are relationally established words that we’ve come to understand very specifically—that’s one of the amazing parts of language is the way that it’s coded to have these specific intertextual meanings and when you pair them together they create these cool structures that are poetic and expressive and can get to the heart of things. The thing I’m trying to say is, the relationship between those words and the number is completely in question, and the tactics used to defend the number in language is something I would like to see further substantiated and I’d like to tease out the claims being made. Or it could just be pure poetry. And if it’s just pure poetry, that calls into question the use of the number even more.
Basically what you have to do is describe…I’m sure in film there are these tropes in writing. In film criticism, critics often say “don’t just narrate the plot”; it’s frowned upon. And in a lot of ways, music criticism doesn’t have that. Music criticism wants you to narrate the architecture of the album because it’s not evident most of the time. It’s actually just not apparent. It’s just a hallucinatory kind of thing to listen to music.
AT: I guess what I’ve always seen in Pitchfork’s language though is they keep running into the inability to do that, right? You have your own thoughts about writing about sound. I don’t think it’s impossible to write about sound, but I think it’s harder to write about sound than it is to write about most anything else.
NS: Definitely, and I think music’s hallucinatory quality makes us really dependent on certain language tropes and really makes us lean on specifically the adjective and this really descriptive language to give it substance. Essentially we’re carving narrative into sound automatically by using those things, or we’re giving it something that it just doesn’t have—that being language—when we talk about it. Or we have to talk about it through metaphor or we have to give it social or political valence in order to do that. I think in the actual review text, there are very formal or structural things I’d want to be aware of. If we were to isolate certain paragraphs by what each paragraph is trying to do, say the review is seven paragraphs, you’d probably have three or four of those paragraphs just describing the album. You’d probably have one or two of those paragraphs giving some kind of social or political context, broader and then specific to that artist’s career or something. So you have these very specific critical agendas that are actually coded into individual paragraphs in a music review. This might be a larger data project, but almost like give those numbers or something.
As far as I’m concerned, if we look at that deviation thing [fig. 2, bottom], if we are getting more and more similarly scored reviews, I would also argue the actual shape of those reviews, the text of the review, is becoming more and more alike. So basically we’re getting these regurgitated forms that follow very specific scripts that are giving the same scores to probably the same music in similar ways. So I mean, what’s the point at that juncture other than to affirm general PR campaigns? And it just kind of becomes existential at that point. It’s like, well, why do we even write about music? And that’s why I think the model of this bell curve breaks. What is actually happening?
If you go back and read some of the earlier Pitchfork reviews, there was some truly weird writing. It was an experimental platform in some ways, and as things develop trajectories and the line gets drawn across a longer trajectory, you’re kind of only going to get more uniformity. It’s kind of what you were saying about The New Inquiry too. As a publishing platform, I think The New Inquiry was originally designed for some pretty rapturous writing or whatever—polemics, critiques, some outlet for a cultural underclass to freely write. But as it became an institution in itself, you had this thing about what it meant to write for The New Inquiry, and you had this shape or this form that you’re ultimately—it’s a representation of what it’s actually trying to do.
AT: This is just length of reviews.
During the interview, the standard deviation (bottom right) was used to measure the variance of article length. This was arguably a mistake, though arguably not: while the standard deviation is considered a formally improper measure of variance for non-normal distributions (like the right-skewed distribution of review length), it is considered so because of the weight it places on outliers—in this case, unusually long articles. The discussion surrounded the idea of a normalization of writing in Pitchfork. If that type of homogenization had occurred, it would essentially preclude the existence of outliers almost altogether. It’s possible that, even if formally improper, the standard deviation, given the weight it does place on outliers, is a valid measurement in this particular instance.
Still, the median absolute deviation is included in this array for clarity (bottom left). It shows no such increase in variance. Regardless, both graphs refute the hypothesis at hand—that Pitchfork’s articles have become more regular in size—and either statistic was capable of momentarily disrupting a stream of criticism and prompting a consideration of Pitchfork’s virtues.
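The difference between the two statistics can be sketched with Python’s standard library. The word counts here are hypothetical, chosen only to show how a single very long article inflates the standard deviation far more than the median absolute deviation:

```python
import statistics

def mad(xs):
    """Median absolute deviation: median of |x - median(xs)|."""
    m = statistics.median(xs)
    return statistics.median(abs(x - m) for x in xs)

# Hypothetical review word counts: mostly ~700 words, plus one long outlier.
typical = [650, 680, 700, 710, 730, 750]
with_outlier = typical + [3000]

for label, xs in [("typical", typical), ("with outlier", with_outlier)]:
    print(label, round(statistics.stdev(xs), 1), mad(xs))
```

Adding the one outlier barely moves the MAD but inflates the standard deviation more than twentyfold, which is why a homogenization hypothesis (few or no outliers) makes the standard deviation less objectionable here than it would be in general.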
NS: The length has gone up.
AT: Well the variance of length has gone up [fig. 4, bottom right]. The length over the years is kind of across the board [fig 4, top right], there’s no real pattern to it. And this is the distribution of review length just across all reviews [fig. 4, top left]. And this doesn’t tell you much, the fact that the median review is around like 700 words. But this is the variance of length. So there’s much larger variance than there used to be.
NS: Is that with reviews or features?
AT: It’s with everything that has a score. So at least in terms of the size of reviews, we might have to be a bit more charitable to them in that regard. And that kind of brings up this other question, which is, isn’t there something good about Pitchfork? At the end of the day, I would rather Pitchfork exist than Pitchfork not exist with nothing to replace it. It’s still a music publication owned by Conde Nast that doesn’t have to write about Sote at all. They could completely ignore that music entirely and there would be no major publication discussing it. So in that graph of seeing their variance go up, I want to take the opportunity to say something nice about them.
NS: It might just be that they’re publishing more so there’s more variance just based off of the scale of what they’re publishing. There’s all kinds of things that could be going on with that.
AT: But they’ve actually kind of leveled off [fig. 5]. Now it’s like 1200 reviews a year. I don’t know how that compares—
NS: That’s a lot. I mean that’s like a machine, that’s so much writing. Which is cool, I mean I like that fact of it. I think one of the things I want to highlight is that there is a series of assumptions that comes with the model, and with sites like Pitchfork that quantify the aesthetic experience in the way that they do. I think that is the kind of narrative thing you’re talking about, how it matches certain economic models that exist in other fields. But I also think it exists in quantifiable lists and rankings of content as well, which kind of falls outside the purview of the review. But it is something I think is ultimately a capitalist way of thinking.
AT: And like the American university has been decimated by this system, right? My own university, Temple, their business school was in a rankings scandal for fabricating this data.
NS: Because they could attract more money by inflating these numbers.
AT: The whole ensemble of benefits that come with having higher rankings in US News.
NS: This is the same problem we experience in almost everything. And in a way I almost like that flattening. If capitalist realism does anything, it flattens this kind of spectralization where we think music is this special little thing we can escape into, but when we see how it’s turned itself culturally, we see the same old shit. And in that way, I think that flattening is beneficial because it allows us to see things for what they are sometimes, and the motivations for certain tactics that are used in publishing and criticism, the line of review rankings or whatever. But my thing is, there are so many other models we could be using for how we talk about music. Does it have to be this very obvious one that reflects a very specific set of values? And that’s…Of course the thing is, how would we ever mount that? How could we get people to write on those terms or whatever? Because this is what’s understood. And then we get into a very specific political discussion.
AT: This is the problem with a lot of data visualization itself. Hopefully it’s not the problem with what we’re doing right now. But with the first piece I made for Components, it was like the fetishization of order and numbers. The instant you attach a number to something, or the appearance of a quantitative method to something, you put it out of reach not just to most people, because they don’t use those methods, but even the people who do understand those methods are still pulled in by the sheer gleam of its presence. When you know where the glow of that light is coming from, you can’t help but feel the warmth of it.
NS: The presence of the number almost makes the language embarrassing, if that makes sense. Say something gets a negative review. Like today: James Blake, singer, producer, one of my favorite artists in like 2010, 2011 because he started this new production style of spatial dubstep music while also being a singer-songwriter. He’s made this new album, he’s gotten really mainstream, and a review got published by one of the few Pitchfork writers I still respect, a guy named Philip Sherburne. He gave the record a 5.8. I’m friends with him on Facebook and he posted this thing that was like, “Had a rough day on Twitter today,” because he was just getting flamed by all these fanboys of this artist, because he gave the record a 5.8 after Pitchfork has had a huge history of giving his records Best New Music. So basically, you know he’s a phenomenal writer, but the presence of a 5.8 was so scandalous to James Blake’s fans that they could then look at Philip Sherburne’s writing and just tear the guy apart. The writing becomes this flimsy defense where the number’s presence, and it being such a strong authority, almost attracts human beings to look at the language and they just start ripping it apart. And the language actually does become ridiculous at that point because you start looking at it and the number is just laughing at the language. The number doesn’t give a fuck about the language and the whole thing just falls apart.
AT: Neither gives a fuck about the other.
NS: It’s almost this weird odd couple. Then you look at the language and memes start getting made, taking quotes from the writer. If you abstract any sentence from it, it just looks ridiculous. But he has a point to make about James Blake’s career and he did a good job of talking about it in some way. And if you were just chatting with the guy about his opinions on music over a beer—he lives in Barcelona, at a bar in Barcelona—I’m sure you would have a much better time talking about the record or actually learn more about his opinions on the record.
AT: But like, you could have a version of that conversation without the number. Remove the number and you’re having that conversation at the bar.
NS: Again, I think it’s the authority that gives it that sensational quality and basically roots the whole thing to be part of this apparatus where you want people to have that reaction, because you want to put things in nice little places and rank them. And everyone loves to make lists. It’s a huge thing human beings love to do. I love to make lists too. There’s a big difference I think between making a list of 100 to 1, with 1 being the best and 100 being the worst, and having a crazy list of fractals spreading out in all different directions.
AT: The need for taxonomical order is so strong that even when we know the number is bullshit, we can’t help but react to it and be affected by it in some way.
NS: We either celebrate its presence or are oppressed by it. It’s literally a stand-in for power. It ultimately removes nuance from what language and writing about music effectively does, which is a very subtle, intimate reflection someone is having on a listening experience. And the presence of the number is absurd. It is absurd. It’s like…what? It haunts the writing in the most crass way.
AT: The number is necessary for Pitchfork’s existence though, right? Conde Nast owns Pitchfork because Pitchfork publishes a number that people click to see. Remove the number and they would not be a property of Conde Nast. And that kind of goes back to this question of—
NS: Capitalist realism.
AT: Well that certainly, I mean everything comes back to that question. But the question of not having Pitchfork at all with nothing to replace it versus having it as it is. And the capitalist realist reaction to it is like, this is the best you’re going to get, actually. And within the confines of the system in which we live, where we demand quantification, measurement of everything at all times, Pitchfork is maybe the best version of that you’re going to get. I mean, who else are you going to get? Fucking Rolling Stone?
NS: Yeah, maybe on a mass scale. And I think this is why we see experimentalism as a minor form. You see these homegrown music writing communities or journals or all types of things that people flock to to deal with this problem.