One of the realities of running a business is that you will — at some point — need to make money. This means you must sell something for more than it cost to make it.

Profit = Total Revenue - Total Expenses

So, to sell a service or feature (like a third-party API portal) at a profit, we need to understand how much it costs to make it, so we know how much to sell it for. In business parlance, the quantity we are looking for is the cost of goods sold — the accumulated costs used to create a good, including direct labor, materials, and any overhead.

The cost of goods sold (or COGS) is one component that goes in to measuring total expenses, and it is invaluable for financial analysis and planning, allowing us to calculate fundamental company metrics like gross margin.

Gross Margin = (Revenue - COGS)/Revenue

Gross margin represents how much of each dollar sold the company keeps as gross profit. This gross profit can be used to pay off debt, pay administrative expenses, or it can be distributed to shareholders.

Naturally, managers and financial analysts are very interested in these numbers, and we, as engineers, can help them by determining the cost of goods sold of the products we produce. How? Let’s walk through an example, learning how to use some common spreadsheet functions to simplify the task.

A full sample spreadsheet of the results is available as a Google Doc:

https://docs.google.com/spreadsheets/d/1VFGTCtNS4xwTgE7PBLwbEc5a8RfGzLZTg0FXzpTWPU0/edit#gid=0

Calculating Cost of Goods Sold

If you are lucky, an estimate for the cost of the service you are building can be found using Amazon’s Simple Monthly Calculator, or a similar service. If you are not so lucky you will have to roll up your sleeves (and break out your copy of Excel).

In the running example that follows I will show how to gather some basic cost estimates for running Amazon’s API Gateway, using Amazon’s publicly available pricing information. You can use this example to extrapolate estimates for your service.

Gathering Inputs

The first step is to gather your inputs. This means consulting the documentation for any services you are using and grabbing the pricing information. You will need this information available in your spreadsheet for further calculations.

For API Gateway, there are two factors for pricing: API calls and cache usage. Amazon charges $3.50 per million API calls received, plus the cost of data transfer out, in gigabytes, charged according to the following table.

RateData-Transfer-Out
$0.09/GBfor the first 10 TB
$0.085/GBfor the next 40 TB
$0.07/GBfor the next 100 TB
$0.05/GBfor the next 350 TB

You can optionally provision a cache alongside your API at the following rates:

Cache Memory Size (GB)Price per Hour
0.5$0.020
1.6$0.038
6.1$0.200
13.5$0.250
28.4$0.500
58.2$1.000
118.0$1.900
237.0$3.800

Cleaning the Data

Our inputs are in text and table form, and aren’t directly amenable to doing calculations. We need to clean things up a bit.

API call charges of $3.50 per million API calls can be converted to cost per API call.

$3.50 per million = $3.5/1000000 = 0.0000035

Data transfer charges are more interesting. The wording “$0.09/GB for the first 10TB, $0.085/GB for the next 40TB, etc.” actually means you will pay $0.09 for any usage between 0 and 10 TB and pay $0.085 per GB for any usage between 10TB and 50TB, and so on. To convert this into a table format, we need to list each of these cumulative tiers and the price associated with each. The table ends at the 150TB mark because at any usage after that we are paying the $0.05 per GB rate.

Rate (per GB)Data-Transfer-Out (TB)
$0.090
$0.08510
$0.0750
$0.05150

We can simplify this by converting everything to the same unit.

Rate (per GB)Data-Transfer-Out (GB)
$0.090
$0.08510240
$0.0751200
$0.05153,600

Lastly, we have the cache pricing, which is already in a nice format and can be copied directly.

Cache Memory Size (GB)Price per Hour
0.5$0.020
1.6$0.038
6.1$0.200
13.5$0.250
28.4$0.500
58.2$1.000
118.0$1.900
237.0$3.800

Making Assumptions

We have our input data, now we can start making some usage assumptions. API Gateway is priced in a few ways: number of API calls, data transfer out, and cache usage. For example purposes, we can use an estimate of 30 million API calls, an average API payload size of 3KB, and a 6.1GB cache.

Assumptions are hard to make, but the good thing is that if you setup your inputs and calculations correctly, you can modify your assumptions and all of your costs will be updated via the spreadsheet calculations.

  • API calls per month: 30,000,000
  • Average payload size: 3KB
  • Cache size: 6.1GB

Calculating Charges

Given our inputs and assumptions, we can start to derive some cost estimates. This generally involves a bit of multiplication and summation. But there are a few ways to make handling pricing tables and tiered pricing easier using spreadsheet functions. I’ll go into these methods below.

API call charges

API calls are charged at a per call rate. This calculation is fairly direct, simply multiply the number of API calls per month by the rate per API call.

API call charges = API calls per month * rate per call
API call charges = 30,000,000 * $0.0000035
API call charges = $105.00

Data transfer charges (SUMPRODUCT)

Data transfer charges first require knowing how much data we are moving per month. This can be estimated using the number of API calls and the average payload size.

Total size of data transfers = 3 KB * 30 million = 90 million/KB ~= 90 GB

Now we need to calculate how much transfer cost would apply to this value. Data transfer follows a tiered pricing model, which is slightly more difficult to calculate. As an example, for a given usage of 30TB (30720GB), you pay $0.09 for the first 10TB and $0.085 for the rest. In our example usage, we want to calculate

($0.09 * 10TB) + ($0.085 * (30TB - 10TB)) =
($0.09 * 10240GB) + ($0.085 * (30720GB - 10240GB)) =
($0.09 * 10240GB) + ($0.085 * (30720GB - 10240GB)) =
($0.09 * 10240GB) + ($0.085 * 20480GB) =
($0.09 * 10240GB) + ($0.085 * 20480GB) =

You can calculate this in spreadsheet form using the SUMPRODUCT function, which calculates the sum of a product of values in an array. First, take your table of tiered rates, and an input value (30720GB):

Rate (per GB)Data-Transfer-Out (GB)
$0.090
$0.08510240
$0.0751200
$0.05153,600

The SUMPRODUCT function looks like:

=SUMPRODUCT(--(30720>{0,10240,51200,153600}),(30720-{0,10240,51200,153600}),{0.09,-0.005,-0.02,-0.02})

The first parameter in this function will return 1 or 0 if the 30720 is greater than a value in the array (– coerces a boolean to a number). The second parameter will calculate the difference between the input and the current price tier. The last parameter lists the differential price rate, which is the difference between successive price tiers. In effect, what we are doing is calculating the price for the entire amount at the maximum rate, and then subtracting the discount applied at each price tier to get the final result.

For example:

=SUMPRODUCT(--(30720>{0,10240,51200,153600}),(30720-{0,10240,51200,153600}),{0.09,-0.005,-0.02,-0.02})
=SUMPRODUCT(--{TRUE, TRUE, FALSE, FALSE},{30720, 20480, -20480, -122880},{0.09,-0.005,-0.02,-0.02})
=SUMPRODUCT({1, 1, 0, 0},{30720, 20480, -20480, -122880},{0.09,-0.005,-0.02,-0.02})

Now each value is multiplied, and the result summed.

=SUMPRODUCT({30720, 20480, 0, 0},{0.09,-0.005,-0.02,-0.02})
=SUMPRODUCT({2764.80, -102.4, 0, 0})
=2662.4

If we do the same calculation for our assumed usage of 90GB, we get the total of $8.10 (substitute the location of your rate tiers, and differential price table for your spreadsheet into the A1:B4 section).

=SUMPRODUCT(--(90>A1:A4), (90-A1:A4), B1:B4)
=8.1

Cache charges (VLOOKUP)

The last calculation we need to make is cache charges. This is a direct lookup between the desired cache size and the price per hour of that size. The VLOOKUP function can help us here. VLOOKUP finds the closest matching row in a table, and returns the specified column value. In our example, we can specify our desired cache size and use VLOOKUP to find the price per hour for that size. Then just use simple multiplication to determine the cost per month.

For a cache size of 6.1GB and roughly 730 hours in one month, our function would look something like the following (substitute the location of your cache pricing table for your spreadsheet into the A1:B8 section)

=VLOOKUP(6.1, A1:B8,2,TRUE)*730
=$0.20*730
=$146.00

Total cost

The total cost of API Gateway is then the sum of each of these charges. In a true cost of goods calculation you would also include the cost of labor, and some ancillary costs.

Some additional costs not captured in this simplification are the costs for running CloudWatch to monitor APIs, and the cost of a Lambda function for authorizing API calls.

Where to go from here?

Given the pricing information and assumptions, you should now be able to construct a basic cost model for your service. By varying your assumptions you can start to see how different rates of growth of your service will affect the cost to run it, allowing you to plan your costs into the future. This allows you to prepare monthly and yearly estimates and understand the levels of service utilization that would turn a profit, where your break even point is, and generally gauge your return an investment. If you are able to produce a cost model before building out your service, you may decide that the effort isn’t worth the return and decide to build a different service, or to design your service to reduce cost. This is the essence of business planning and financial management.

Appendix

A full sample spreadsheet for using SUMPRODUCT and VLOOKUP is available as a Google Doc:

https://docs.google.com/spreadsheets/d/1VFGTCtNS4xwTgE7PBLwbEc5a8RfGzLZTg0FXzpTWPU0/edit#gid=0

Note: My financial knowledge comes from reading exactly one book as part of a business plan competition. I am a software developer, not an accountant.