Background
What I want to achieve
I have some handy lookup tables that I want to find a formula for, so that I can plug in arbitrary numbers between the ones the table lists, and get serviceable results.
What I've got so far
The lookup tables I've got have the following format:
|ys\xs|-| 1 | 2 | 3 | 4 |...|
|-----|-|---|---|---|----|---|
| 5 |-| 40|150|350| 630|...|
| 10 |-| 50|210|460| 820|...|
| 15 |-| 60|240|530| 940|...|
| 20 |-| 65|260|580|1020|...|
| ... |-|...|...|...|....|...|

This maps each x and y value pair to a given value. For instance if you look up x = 2 and y = 10 you would get 210. However it is kinda difficult to figure out a decent value if you actually had say x = 3.6 and y = 17.
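To make the gap concrete, here is a minimal sketch of the table as a data structure, written in Python purely for illustration. The names `XS`, `YS`, `TABLE` and `lookup` are my own, and only the values visible in the table above are included (the elided rows and columns are left out).

```python
# Hypothetical names; only the cells visible in the table above are included.
XS = [1, 2, 3, 4]
YS = [5, 10, 15, 20]
TABLE = [
    [40, 150, 350, 630],    # y = 5
    [50, 210, 460, 820],    # y = 10
    [60, 240, 530, 940],    # y = 15
    [65, 260, 580, 1020],   # y = 20
]

def lookup(x, y):
    """Return the tabulated value for (x, y), or None if the pair is off the grid."""
    if x in XS and y in YS:
        return TABLE[YS.index(y)][XS.index(x)]
    return None

print(lookup(2, 10))    # 210
print(lookup(3.6, 17))  # None
```

The second call is the case I care about: the pair falls between grid points, so the table alone has no answer for it.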
I've transcribed a couple of the tables to Google Sheets, trying to follow the same matrix/array format (to make the transition from paper easier to follow):
| | 1 | 2 | 3 | 4 |...|
|---|---|---|---|----|---|
| 5 | 40|150|350| 630|...|
| 10 | 50|210|460| 820|...|
| 15 | 60|240|530| 940|...|
| 20 | 65|260|580|1020|...|
| ... |...|...|...|....|...|

After a bit of research I think that what I want is a multivariate regression/multiple regression, with the xs and ys as independent variables and the values in the matrix as the known values of the dependent variable.
(At this point I considered a few alternatives. Instead of asking the statistics Stack Exchange and Stack Overflow for help re-inventing the statistics wheel at the points where I'd inevitably get stuck, or researching what the good people at the Data Science Stack Exchange claim to be the best Python packages for grinding through a tiny dataset, I figured that maybe the problem had already been solved in the world of spreadsheets in a simple and accessible way. Which has led me to post the question here.)
For doing multivariate regressions in Google Sheets, I further found that I could probably use the built-in LINEST or LOGEST functions, and I got LOGEST to work by restructuring some of the data by hand into a list like this:
|results|xs|ys|
|-------|--|--|
| 40 | 1| 5|
| 50 | 1|10|
| 60 | 1|15|
| 65 | 1|20|
| .. |..|..|
| 150 | 2| 5|
| 210 | 2|10|
| .. |..|..|

And then putting it into LOGEST this way: LOGEST(A3:A_n,B3:C_n).
The challenges
Doing so yielded 3 numbers, let's say a, b, and c. But from here the documentation is somewhat terse (in the case of LOGEST; for LINEST, multiple independent variables are not mentioned at all), so I'm uncertain whether they are supposed to be used like this: ln(result) = x * ln(a) + y * ln(b) + ln(c), or whether x and y should be swapped, or whether I should do something else entirely with the output.
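To pin down which model I mean, here is a rough sketch in plain Python (no packages) that fits ln(result) = x * ln(a) + y * ln(b) + ln(c) by ordinary least squares on the visible part of the table. This only spells out the model form from the equation above; it is not a claim about how LOGEST computes or orders its output, and all names (`fit_exponential`, `predict`, `solve3`) are made up for the illustration.

```python
import math

# Hypothetical names; only the cells visible in the table are used.
XS = [1, 2, 3, 4]
YS = [5, 10, 15, 20]
TABLE = [
    [40, 150, 350, 630],    # y = 5
    [50, 210, 460, 820],    # y = 10
    [60, 240, 530, 940],    # y = 15
    [65, 260, 580, 1020],   # y = 20
]

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= f * a[col][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = a[r][n] - sum(a[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = s / a[r][r]
    return sol

def fit_exponential(xs, ys, table):
    """Least-squares fit of ln(result) = x*ln(a) + y*ln(b) + ln(c); returns (a, b, c)."""
    rows = [(x, y, 1.0, math.log(table[i][j]))
            for i, y in enumerate(ys) for j, x in enumerate(xs)]
    # Normal equations (A^T A) p = A^T z with p = [ln a, ln b, ln c].
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
    atz = [sum(r[p] * r[3] for r in rows) for p in range(3)]
    ln_a, ln_b, ln_c = solve3(ata, atz)
    return math.exp(ln_a), math.exp(ln_b), math.exp(ln_c)

def predict(a, b, c, x, y):
    """Evaluate result = c * a**x * b**y, the exponentiated form of the model."""
    return c * a ** x * b ** y

a, b, c = fit_exponential(XS, YS, TABLE)
print(a, b, c)
print(predict(a, b, c, 3.6, 17))
```

If LOGEST fits the same model, its three numbers should match these up to ordering, which would at least tell me which output belongs to x and which to y.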
Another challenge is that reformatting the first array to a table by hand as shown above was a lot of work, and I think there should be a better way to get the results given what I have. But my spreadsheet and searching skills are insufficient to get further at this point.
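For clarity, the reshaping I did by hand amounts to the following, sketched in Python only to show the transformation (the names are hypothetical, and the data is just the visible part of the table). What I'm looking for is the spreadsheet equivalent of this step.

```python
# Hypothetical names; only the cells visible in the table are used.
XS = [1, 2, 3, 4]
YS = [5, 10, 15, 20]
TABLE = [
    [40, 150, 350, 630],    # y = 5
    [50, 210, 460, 820],    # y = 10
    [60, 240, 530, 940],    # y = 15
    [65, 260, 580, 1020],   # y = 20
]

def unpivot(xs, ys, table):
    """Flatten the x/y grid into (result, x, y) rows, one column of the grid at a time."""
    rows = []
    for j, x in enumerate(xs):
        for i, y in enumerate(ys):
            rows.append((table[i][j], x, y))
    return rows

for row in unpivot(XS, YS, TABLE)[:6]:
    print(row)
# (40, 1, 5)
# (50, 1, 10)
# (60, 1, 15)
# (65, 1, 20)
# (150, 2, 5)
# (210, 2, 10)
```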
Finally, I also suspect that the data grows exponentially from left to right but follows a logistic distribution from top to bottom. I haven't been able to figure out whether Google Sheets has a regression method that fits that kind of data well at all, or whether experimenting with LINEST and LOGEST is the best I can do.
The question
Given a set of data points in an array in Google Sheets and the associated input value pairs for each entry (as shown above), how do I do a regression to get a formula with 2 input variables that decently matches the known points?