Using Machine Learning to predict the market

Sun, 07/09/2017 - 18:48
Ailsien

Hi,

~~Google Deepmind data scientist~~ random kid here,

Has anyone tried this yet? This game seems like it would be the perfect sandbox to test out algorithms and try to predict the market. Do people do this in games frequently?

If you were to do this, what sorts of data would you include? I guess it would be kind of hard since we can't scrape the history.

Sun, 07/09/2017 - 19:27
#1
Bopp
data then statistics

Yes, you need the data set. As a rule of thumb (not a hard-and-fast rule), large data sets beat small data sets.

Unfortunately, I don't know of any convenient way to scrape these data. Maybe the Steam market provides an API for its data? If so, then it might be a better sandbox than Spiral Knights.

Once you have the data set, you can do all kinds of statistics on it.

Tue, 07/11/2017 - 05:16
#2
Legobuild
Routine

I would actually say you don't necessarily need previous data to estimate the markets here. There are patterns; even if they can't always be predicted exactly, with patience you would get the same results. For example, during flash sales energy is in high demand as people try to get the deals, so the price of energy skyrockets. Similarly, when there are auction house featured sales, people need crowns and the energy market tanks. Every so often there is also a player who has bought a lot of energy with money and decides to sell it; this is the less expected large swing. There is one user-driven chart a player made a while back that tracks the energy market:

https://sites.google.com/site/skcemarket/chart

This tool is only as good as how often players enter the prices. Still, overall it gives somewhat of a trend.

Tue, 07/11/2017 - 16:03
#3
Skepticraven
↑↑↓↓←→←→ba

What part of the market would you be interested in studying? The energy vs. crowns trade rate?
I'd say that if you want to sandbox new algorithms, this isn't a particularly good sandbox.
What type of learning - supervised or unsupervised?

You'll have better luck using one of these datasets: https://github.com/caesar0301/awesome-public-datasets

That being said, if you wish to still scrape and process some data related to this game...
I'd recommend the AH.
The ideal data would be to have the bids and sales of all items on the AH.
A more realistic option would be to take snapshots of the contents of the AH and impute the sales. If an item disappears before its time remaining reaches "very short", it sold for the buyout. Items that disappear while on "very short" sold for their current bid.
It would be interesting to even just run statistics on how much people are willing to pay for items as a time series (along with a hint at the available supply).
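
The snapshot idea above could be sketched roughly like this. This is only an illustration: the `Listing` fields, the `listing_id` key, and the time-left labels are hypothetical, and it assumes each listing can be matched between two snapshots. It also does not separate out listings that expired on "very short" without ever getting a bid.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Listing:
    """One AH entry as seen in a snapshot (all field names are hypothetical)."""
    listing_id: str
    item: str
    time_left: str  # assumed labels, e.g. "long", "short", "very short"
    bid: int
    buyout: int


def impute_sales(previous: Dict[str, Listing],
                 current: Dict[str, Listing]) -> List[Tuple[str, int]]:
    """Infer (item, price) for listings that vanished between two snapshots."""
    sales = []
    for listing_id, old in previous.items():
        if listing_id in current:
            continue  # still listed, nothing to infer yet
        # Gone while on "very short": assume it ended at the current bid.
        # Gone earlier than that: assume somebody paid the buyout.
        price = old.bid if old.time_left == "very short" else old.buyout
        sales.append((old.item, price))
    return sales
```

Running this over consecutive snapshots and appending the results with a timestamp would give the kind of price time series mentioned above.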

Wed, 07/12/2017 - 08:05
#4
Dats-Mah-Boi

@Skeptic

I don't think the AH is a good source, because I know many people who put items up and bid on them with their other accounts to drive the final sale price up. You can't tell if an item sold to another player in a legit buy or if the price was inflated by an artificial bid war.

Fri, 07/14/2017 - 04:06
#5
Skepticraven
↑↑↓↓←→←→ba

@Lethal-Bunduru

Sure, for rare items. On the grand scale (of all items in the AH), I feel like this instance would be a minority. With standard approaches such as word-dictionaries (one-hot vectors) for encoding "which item", handling the variety shouldn't be out of grasp. I envision a handful of possible network approaches for this:
1. Input of item (one-hot vector) + "features" (time of end-sale, etc.), output prediction of sale price (regression).
2. Input of item (one-hot vector) + "features" (time of end-sale, etc.) + bid price + buyout price, output classification of "will item sell" yes/no.
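
As a minimal sketch of the input encoding both approaches share (plain Python, no ML library; the item names and the extra feature values are made-up placeholders, not real AH data):

```python
from typing import Dict, List, Sequence


def build_vocabulary(items: Sequence[str]) -> Dict[str, int]:
    """Map each distinct item name to a fixed index (the "word dictionary")."""
    return {name: index for index, name in enumerate(sorted(set(items)))}


def encode(item: str, vocab: Dict[str, int],
           features: Sequence[float]) -> List[float]:
    """Concatenate the item's one-hot vector with its numeric features."""
    one_hot = [0.0] * len(vocab)
    one_hot[vocab[item]] = 1.0
    return one_hot + list(features)


# Hypothetical usage: two known items, one extra feature (hour of end-sale / 24).
vocab = build_vocabulary(["item_a", "item_b", "item_a"])
row = encode("item_b", vocab, [0.5])
```

For approach 2, the bid and buyout prices would simply be appended to `features`; the resulting rows can then go into whatever regression or classification model you prefer.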

In my experience, classification problems (cross-entropy error) tend to work much better than regression (mean squared error).
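
For reference, the two error functions mentioned here can be written out as follows. This is a plain-Python sketch of the standard definitions, not tied to any particular network library; the `eps` clamp is an added assumption to avoid taking log(0).

```python
import math
from typing import Sequence


def mean_squared_error(targets: Sequence[float],
                       predictions: Sequence[float]) -> float:
    """Average squared difference, used for the regression setup (approach 1)."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def binary_cross_entropy(labels: Sequence[int],
                         probabilities: Sequence[float],
                         eps: float = 1e-12) -> float:
    """Average negative log-likelihood for the yes/no setup (approach 2)."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() never sees 0
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```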

This approach could even be scaled to a subset of items (say materials only) in the AH.

--

You could also say the same about the energy markets. (In terms of price manipulation.)
I bet some people buy out the current supply just to sell everything a bit higher (and put viewable offers to buy to have impatient people out-bid that offer).

--

On a separate note, the wiki subforum has a handful of nice datasets on other items.
Although I've only been using standard statistics to make predictions, here are 3 other possible datasets to use for general ML... (non-market):
1. Treasure Box Drop (~10k samples)
2. Item Forging Bonuses (~4k samples)
3. 4-token Boss Runs (tiny dataset)
