Probably the most frequently cited (and most controversial) use of film analytics is in the script. Numerous companies are known for using analytics, at a hefty price, to advise the major studios on how they ought to write their scripts. This of course leads into the creativity vs. automation debate, which is the crux of the controversy.
I enlisted the help of a data scientist friend, Jacob, in writing this post. I fully believe that analytics can make you a better, more creative, and more profitable filmmaker. And the beauty is that with all the machinery, you, the human, are still at the helm. You still get to decide to ignore the analytics if you choose.
Jacob and I came up with two basic methods that you can use to improve your script: surveys and regressions.
1. Market research (surveying).
Companies like IBM and Structured Data Intelligence (through the Video Genome Project) slice and dice script elements to generate a scale that audiences can use to rate which elements are most important to them. In this way the companies can take a script and find out which parts should be expanded, added, or focused on and which parts should be cut. They also find out what kind of market is interested in the elements of the film and who the advertisements should target. This is market research.
The big point of market research is feedback. I love feedback. Why pretend that you know what someone else thinks, wants, or prefers when you can just ask?
Going out and manually interviewing people is an expensive process. Plus, I personally hate taking surveys. I think that most people would agree (but I haven’t run the analytics on it). But I don’t mind giving feedback, especially if something isn’t the way I like it. I went a little overboard offering “constructive” feedback to filmmakers while studying film and acting in college. Those who made changes certainly ended up with a product that I preferred much more.
Since then I learned a lot about the truth of the axioms: “you are not the customer” and “the customer is always right.”
The truth is that I’m not the customer or the audience. No matter how hard I try, my audience always disagrees with me about something or other. Sure, they don’t have the talent or (let’s be honest, it’s more like they don’t have the sheer narcissism and) the willpower to get through the filmmaking process, but what they want still matters.
Why? Because they either promote my film, ignore it, or grind it into the dirt. Because they give me their money and encourage others to do the same or they ask for a refund and “save” their friends and family from me.
As with all things, I am the decision maker and I am the creator. I choose what I want in this film, but I want to align to the preferences of my audience enough so that they will be satisfied. I also want to use my own voice enough that I’ll be satisfied and the audience will be happily surprised and hopefully moved. You see where I’m coming from now, don’t you?
Market research (or audience feedback) has a few steps.
1a. Define your objective.
Chances are that you already have a solid idea for your film. Hopefully, you identified your audience early in the process. If you’re just getting started and don’t have your idea set in ink yet, I suggest that you use analytics to help you generate the idea for your next film. If you’re already along in the process, you can still use analytics to find your target audience.
By defining this audience group you can determine what it is you want your script to do for them. Are you trying to Michael Bay (mindlessly entertain) them? Are you trying to make them laugh? Cry? Maybe you want to build a sense of solidarity and unity between the group and the film? Or perhaps you want to shine a spotlight on the issues that they deal with daily and that currently lack attention?
Whatever your objective is, if you choose one that’s for your audience and not (just) for yourself, your chances of having the film become both appreciated and profitable go way up.
Defining this objective will help while you write your script and while you figure out which parts to keep and which to cut.
1b. Determine your research design.
So, you know that you want to make your target audience laugh. But, how? What tickles the ribs of your target audience?
The purpose of research design is to determine and define who you will be studying, why you will be studying them, where you will be studying them, how you will be studying them, and how you will measure the data you collect. Explorable.com posted a link-heavy outline of the very scientific method of research design.
If you’re up to the task you can explore that rabbit hole. Doing so can be very valuable as it might spark methods of gathering data for your objective that you might not have considered otherwise.
If you’re just looking for quick and dirty then I suggest you just do some logical exploration. Where is my audience physically or online? How can I get among them and find out what makes them laugh?
But there’s more to making an audience laugh (or anything) than finding out their propensity for slapstick. What kind of actors do they find funny (and I know that you may not be able to afford Kevin Hart or even Kevin James, but you could find talent that resembles the familiar styles of certain celebrities)? What movies do they rate highly? What director styles do they prefer? What storylines do they love and what ones do they think are weak?
As you consider the many variables that go into writing your script you should be starting to draw some ideas about what questions you want to get answered and how those answers might look. Typically answers come quantitatively (like, “on a scale from 1 to 7…”) or qualitatively (“tell me, what’s the funniest movie you’ve ever seen? What made it funnier than the others?”).
You should decide whether you want to survey, chat with, or passively observe your audience (or any combination). You will also define how you will attract responses. Finally, you will want to consider how many respondents you will need to make your findings significant enough to be worth listening to.
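If you want a rough number for "how many respondents," here is a minimal sketch using Cochran's sample size formula, which is a standard starting point for surveys. The function name and the default values (95% confidence, the most conservative 50/50 split of answers) are my own assumptions for illustration, not something Jacob or any survey tool prescribes.

```python
import math

def sample_size(margin_of_error, confidence_z=1.96, proportion=0.5, population=None):
    """Estimate how many respondents a survey needs (Cochran's formula).

    margin_of_error: acceptable error as a fraction, e.g. 0.05 for +/- 5%
    confidence_z:    z-score for the confidence level (1.96 is about 95%)
    proportion:      expected share giving one answer (0.5 is the most conservative)
    population:      size of the target audience, if it is small and known
    """
    n0 = (confidence_z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    if population is not None:
        # Finite population correction: small niches need fewer responses.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

# A large, open-ended audience at +/- 5%
print(sample_size(0.05))
# A niche audience of about 2,000 people at +/- 5%
print(sample_size(0.05, population=2000))
```

The takeaway is that a few hundred well-chosen respondents usually goes a long way, and a small niche needs even fewer.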
1c. Design and Prepare your research study.
The obvious next step is to actually make your study, whether it’s a survey, an observation, a quiz, or some other method.
For surveys, Google Docs (I swear, I get no kickbacks from Google, but I should) has a free survey creation tool called Google Forms. Qualtrics is another great survey tool and I’ve heard that students can get a free account, but I don’t know if that holds true for non-students.
These and other DIY (Do-it-yourself) options place you in the same troubling spot you might find yourself in when you complete a film and have no fan base established to sell to. You’re going to have to convince a sample of your target
Two survey products were suggested by my data scientist friend, Jacob. He likes Google Consumer Survey (again, no kickbacks, I promise).
He also likes Survata.
Jacob says, “each of these tools lets you very quickly survey lots of people across the globe. They also let you filter and survey only specific regions or target demographics like male/female, income etc. They also do the data visualization as well.”
Each of these options is cheaper than typical survey options, but if you were surveying 300 respondents your costs could be anywhere from $30 to $750. If they can target your demographic, your niche audience, and get a good response, then it will be money well spent.
I urge you to consider the end goal before choosing your research method. If you think you’re going to be using a budget of tens or hundreds of thousands of dollars, then paying for a survey probably makes sense. But that budget is probably based on the expectation of making 10 times the invested budget.
On the other hand, if your budget is under $10,000 you probably are targeting a very small niche. You’re likely to cap your earning potential at say $100,000. At that point it might be more valuable to target your audience on your own.
Finally, surveys are not the only way. Just getting to know your target audience will give you valuable data to inform your script.
1d. Collect your data.
Once you have everything set up, it’s time to press the go button. Get out there and collect. And remember to collect from a sufficiently large and diverse sample.
1e. Analyze your data.
This is the nerdy part.
You want to clean and store your data now. This means that you categorize all the responses, standardize the quantifiable data (for example, make sure every rating is on the same scale), and ensure there are no duplicates. Typically this is done in a spreadsheet.
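If you would rather script the cleanup than do it by hand in a spreadsheet, a minimal sketch might look like the following. The column names (`respondent_id`, `humor_style`, `rating`) and the 1-to-7 scale are assumptions made up for this example; swap in whatever your own survey export uses.

```python
def clean_responses(rows, low=1, high=7):
    """Drop duplicate respondents, validate ratings, and tidy category labels."""
    seen = set()
    cleaned = []
    for row in rows:
        respondent = row["respondent_id"].strip().lower()
        if respondent in seen:
            continue  # keep only the first valid answer from each respondent
        try:
            rating = int(row["rating"])
        except (TypeError, ValueError):
            continue  # skip rows whose rating is not a whole number
        if not low <= rating <= high:
            continue  # skip ratings that fall outside the survey scale
        seen.add(respondent)
        cleaned.append({
            "respondent_id": respondent,
            "humor_style": row["humor_style"].strip().lower(),
            "rating": rating,
        })
    return cleaned

# Hypothetical raw export with the usual mess in it
raw = [
    {"respondent_id": "A1", "humor_style": " Slapstick ", "rating": "6"},
    {"respondent_id": "a1", "humor_style": "slapstick", "rating": "6"},  # duplicate
    {"respondent_id": "B2", "humor_style": "Satire", "rating": "n/a"},   # not a number
    {"respondent_id": "C3", "humor_style": "SATIRE", "rating": "5"},
    {"respondent_id": "D4", "humor_style": "deadpan", "rating": "9"},    # off the scale
]

cleaned = clean_responses(raw)
for row in cleaned:
    print(row)
```

Only the first and fourth rows survive, with their labels lowercased so that "Satire" and "SATIRE" end up in the same category.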
Realistically, you start figuring out the story that the data tells you. Per our example, what is it that makes this group laugh? What kind of style should this story be told in? What kind of characters or actors should you be considering? What structures? What plotlines? And so on.
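One simple way to start hearing that story is to average the ratings for each category and rank them. This is only a sketch with made-up responses; the field names `humor_style` and `rating` are assumptions for the example, and the ratings are on an assumed 1-to-7 scale.

```python
from collections import defaultdict
from statistics import mean

def average_rating_by_category(responses, category_key="humor_style"):
    """Return (category, (average rating, number of responses)) sorted best first."""
    groups = defaultdict(list)
    for response in responses:
        groups[response[category_key]].append(response["rating"])
    summary = {
        category: (round(mean(ratings), 2), len(ratings))
        for category, ratings in groups.items()
    }
    return sorted(summary.items(), key=lambda item: item[1][0], reverse=True)

# Hypothetical cleaned survey responses
responses = [
    {"humor_style": "slapstick", "rating": 6},
    {"humor_style": "slapstick", "rating": 7},
    {"humor_style": "slapstick", "rating": 5},
    {"humor_style": "satire", "rating": 4},
    {"humor_style": "satire", "rating": 5},
    {"humor_style": "deadpan", "rating": 3},
]

ranking = average_rating_by_category(responses)
for category, (average, count) in ranking:
    print(f"{category}: average {average} from {count} responses")
```

Keep an eye on the response counts. An average built on one or two answers is a hint, not a finding.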
1f. Visualize your results.
The final step is to make complete sense out of your data. Move beyond the charts and graphs and figure out the story you’re going to tell. This is the time that you start writing out character cards and story structure. As you sketch out key scenes and outline the story plot, you should start to see the movie in your head. Not only is this a good thing in combating writer’s block, it’s good because you have a reason to believe that you’re making something that people really want.
2. Linear regression.
Former statistics professor Vinny Bruzzese formed Motion Picture Group (he left MPG and started C4 in 2014, which has the same purpose) to analyze the elements in scripts. They run regressions and other statistical analyses on the relationship between movie script elements and financial performance. Then they provide their suggestions to their clients. Those clients choose to adopt or reject the suggestions. By the time the studio makes its choice, Bruzzese has already cashed his check and moved on.
Linear regression is (very simply put) putting a bunch of data onto a scatter plot and drawing a straight line (hence linear) through the points. You’re looking for a correlation between two variables. Obviously, the data is likely going to be historical.
For our use, film analytics, we want to find a correlation between financial performance or audience ratings and elements of the movie. This data can be scraped from sources like IMDB, Box Office Mojo, Rotten Tomatoes, or any of the many other options for movie data available online.
Script elements may take a little more work to gather. The basic method is to get a hold of a script and parse out the text and also categorize the themes, subjects, elements, characters, camera angles and shots, pacing, and plot lines within each script.
So, step one is getting scripts.
There are a lot of writers who take the time to recreate a script from the finished product that they can screen at home on DVD or Blu-ray. A few places where these scripts can be found are Drew’s Script-O-Rama, Go Into the Story, and a handful of other sites.
Step two: now that you have the data, you need to narrow it down to films that fit more closely with what you’re doing. With our example, we’re making a comedy. It won’t make sense to put WWII dramas into the mix.
The other thing you need to do is get the data out of the scripts. You need to parse out the text and categorize the elements. I have previously mentioned Amazon’s Mechanical Turk as a potentially inexpensive way to get the data parsed and categorized by crowds of people. There are other services available, like ScriptFAQ (to parse and categorize script elements) and ScriptThreads (a GitHub program to parse out the text).
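To give you a feel for what "parsing out the text" means, here is a crude hand-rolled sketch. It is not how any of the tools named above work; it just leans on two common screenplay conventions (scene headings start with INT. or EXT., and character cues sit alone on a line in capitals) and will trip on things like "FADE OUT." The sample scene is invented for the example.

```python
import re
from collections import Counter

# Scene headings conventionally begin with INT. (interior) or EXT. (exterior).
SCENE_HEADING = re.compile(r"^(INT\.|EXT\.)")
# Character cues are conventionally a name alone on a line, in capitals.
CHARACTER_CUE = re.compile(r"^[A-Z][A-Z .'\-]+$")

def parse_script(text):
    """Count interior/exterior scenes and how often each character speaks."""
    scenes = {"INT.": 0, "EXT.": 0}
    speakers = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        heading = SCENE_HEADING.match(line)
        if heading:
            scenes[heading.group(1)] += 1
        elif CHARACTER_CUE.match(line):
            speakers[line] += 1
    return scenes, speakers

# Invented sample scene, not taken from any real script
sample = """
INT. SALOON - NIGHT

DUSTY
Who ordered the sarsaparilla?

MABEL
You did. Twice.

EXT. MAIN STREET - DAY

DUSTY
I regret nothing.
"""

scenes, speakers = parse_script(sample)
print(scenes)
print(speakers.most_common())
```

Counts like these are the easy part. Themes, humor styles, and plot lines still need a human (or a crowd of them) to categorize.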
Using these data sets you can find some correlation between profitability or audience ratings and the elements from these scripts which will lead you to determine what elements matter and which ones just don’t.
To be clear, the tighter the data points cluster around the line, the more the element of the script (say a certain actor or a type of humor like slapstick) correlates with the factor you would like to produce from your script (like how much the audience actually likes your movie or how profitable it is). If the data points look like a big cloud and not so much like a line, then there really isn’t much of a correlation, and therefore it probably doesn’t matter if you use the element or not.
Furthermore, assuming that you arrange the chart normally (the script element along the bottom and the result up the side), a line that goes generally from the bottom left to the top right means that the more of the element you have (Jean Claude Van Damme), the more of the result you hope for ($$$). A line from the top left down to the bottom right indicates that the more of the element you have (Jean Claude Van Damme), the less of the result you hope for (audience approval).
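If you want to see the line-drawing in action, here is a minimal sketch of an ordinary least squares fit. The slope tells you which way the line tilts, and the correlation coefficient r tells you how tightly the points hug it (close to 1 or -1 means tight, close to 0 means a cloud). The numbers below are invented for illustration and do not come from any real film data.

```python
from math import sqrt

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by least squares and return (slope, intercept, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("need variation in both the element and the result")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation, between -1 and 1
    return slope, intercept, r

# Made-up example: number of slapstick scenes vs. average audience rating
slapstick_scenes = [2, 4, 6, 8, 10]
audience_rating = [5.1, 5.9, 6.8, 7.4, 8.3]

slope, intercept, r = fit_line(slapstick_scenes, audience_rating)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```

A positive slope is the bottom-left-to-top-right line, and a negative slope is the top-left-to-bottom-right one.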
While this is a simplified version of a linear regression, it’s good enough to make you dangerous. And dangerous is good. Now you have an idea of what your script might look like, but you want to make sure that it matches up to your specific audience’s needs. The mass appeal metrics will only get you so far.
After you cherry-pick a set of films that you think would help show how close or far the target audience’s taste and purchasing preferences are from the masses, it’s time to narrow your research.
To be incredibly narrow with your target audience, you can send out a survey asking which of your cherry-picked movies they have paid to see (rent, box office, or own) and how they rate them. This isn’t the only option. You could just chat with them or run a poll, but ultimately you want to collect a set of data that shows how they compare to the data you have currently.
At this point you are simply charting the elements against your target audience’s responses and then you’re off to the races.
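Here is a rough sketch of that comparison: for each film, take your target audience's average rating and subtract the general rating. The film titles, the numbers, and the shared 1-to-10 scale are all placeholders I made up for the example.

```python
def taste_gap(mass_ratings, audience_ratings):
    """Per-film gap between the target audience's average rating and the general rating."""
    gaps = {}
    for title, mass_rating in mass_ratings.items():
        scores = audience_ratings.get(title)
        if not scores:
            continue  # nobody in the target audience rated this film
        gaps[title] = round(sum(scores) / len(scores) - mass_rating, 2)
    # Largest positive gap first: films your audience likes more than the masses do.
    return dict(sorted(gaps.items(), key=lambda item: item[1], reverse=True))

# Placeholder data on an assumed shared 1-to-10 scale
mass_ratings = {"Film A": 6.0, "Film B": 7.5, "Film C": 5.0}
audience_ratings = {"Film A": [8, 9, 7], "Film B": [6, 7], "Film D": [9]}

gaps = taste_gap(mass_ratings, audience_ratings)
for title, gap in gaps.items():
    print(f"{title}: {gap:+}")
```

Films with a big positive gap are the ones worth studying closely, since their elements matter more to your niche than to everyone else.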
My friend Jacob (yeah, the data scientist) offered a third option similar to linear regression: machine learning.
Let me quote what he wrote to me:
“‘Categorizing elements’ is a big part of machine learning. These categorized elements in machine learning are called ‘features’. You may also find that there are certain combinations of features that are very hot using machine learning. For example you might find something like: ‘comedy’ + ‘cowboys’ + ‘release date: 2016’ ===> Super popular while ‘comedy’ + ‘cowboys’ + ‘release date: 2014’ ===> mediocre.
“There are hundreds of Machine Learning tools that would let you analyze this data set pretty easily. The most challenging part would be gathering and categorizing data. There’s no way of automatically categorizing the data unless someone has already gone through that manual process. It’s likely that such a dataset exists though I’m not certain about that.”
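To make the "combinations of features" idea concrete, here is a small sketch. To be clear, this is a plain group-and-average over feature combinations, not a machine learning model, and the films, feature labels, and scores are all invented. It only shows the kind of pattern Jacob describes, assuming someone has already categorized the data.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

def score_feature_combos(films, size=3, min_films=2):
    """Average the score of every feature combination seen in at least min_films films."""
    buckets = defaultdict(list)
    for features, score in films:
        for combo in combinations(sorted(features), size):
            buckets[combo].append(score)
    ranked = [
        (combo, round(mean(scores), 2), len(scores))
        for combo, scores in buckets.items()
        if len(scores) >= min_films
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Invented films: (set of hand-categorized features, popularity score)
films = [
    ({"comedy", "cowboys", "release:2016"}, 8.0),
    ({"comedy", "cowboys", "release:2016"}, 9.0),
    ({"comedy", "cowboys", "release:2014"}, 5.0),
    ({"comedy", "cowboys", "release:2014"}, 6.0),
]

ranked = score_feature_combos(films)
for combo, average, count in ranked:
    print(" + ".join(combo), "=>", average, f"({count} films)")
```

With real data you would want far more films per combination before trusting any of these averages, which is where proper machine learning tools earn their keep.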
Whatever method you use, you have accomplished one of the most difficult elements of applying film analytics: identifying what your audience wants.
The thing that makes this so amazing is that you can use what you know or not. You’ve narrowed the universe of options down to something suited for your target audience. You know what they want and what they need. The story is now all up to you.
In case you were hoping for a crash course on actually writing a script, let me refer you to this intro guide to writing a screenplay.
Let me know how it goes.