Interest in a partnership with an online tool?

Jan 13, 2015 at 5:15 PM
Hello!

I'm working on an online parse tool, like torparse was in its best days. I think I can have an early alpha ready within the next week or two. From my point of view, a partnership would be great: something like an "Upload" button in Parsec and, of course, Parsec mentioned and linked on the online parse website.

Before I go into detail, I just want to know whether you would consider a partnership, provided you like the service I'm developing. If so, I'll update this thread as soon as I have something to show.

Thanks
Kinman
Coordinator
Jan 13, 2015 at 7:14 PM
Did you use my parser library to develop from or did you roll your own? My main concern would be that the two parsers result in the same output.

I have had plans for a service for some time and have even developed the UI, logic and encryption to send and receive log files between my service and client. However, I need to find the time to develop the site and also determine if my existing hosting situation can support the size and bandwidth of the service and an online tool. The biggest problem that all the online services run into is funding because it is a lot of data moving back and forth.
Jan 13, 2015 at 8:04 PM
Hi,
no, I'm developing my own parsing logic (or rather, it's nearly finished), but it should be very similar. In a few days I should be able to check whether the results are the same or different. As long as the difference is very small (<0.5%), it shouldn't really be a problem, given a little explanation to the end user.

Hosting gets interesting if I save all data (an estimated 0.5 TB/year at 5 million log lines a day), but I'm thinking about compressing the complete log and storing only the data that is really needed for fast access. This should also result in relatively fast access times. My goal is <50 GB/year. I only pay for storage; traffic is fully included.

Greetings,
Kinman
Coordinator
Jan 13, 2015 at 10:43 PM
The main stats I wouldn't be worried about, but the stats where you have to work around SWTOR's log file shortcomings are where we will see differences. For example, SWTOR has two streams of data that write to the log; they are asynchronous and will appear out of order. One stream is activations and events and the other is damage and heals. Many times the damage from one thing comes late, often after the activations for other things appear. Another common one is that the combat end event occurs before the last hit appears, almost every time. Another item open to interpretation is the user's use of combat stealth. Obviously this separates a fight into two, but there is logic in Parsec to link them. Your logic may be different.

As for data compression and storage: my service was designed to store single fights instead of entire log files, and it stores them compressed. I found this to be a pretty efficient way to do it.

What programming language are you using?
Jan 14, 2015 at 3:00 AM
DrewCerny wrote:
However, I need to find the time to develop the site and also determine if my existing hosting situation can support the size and bandwidth of the service and an online tool. The biggest problem that all the online services run into is funding because it is a lot of data moving back and forth.
It's not that bad if you compress the data and write your own server for it. Something like zlib makes compressing a stream of data pretty easy for something like sharing combat log data in real time. I've been messing around with this recently and it consumes less bandwidth for a 16 player group than Parsec's current implementation: about 200 KB out per player for a 3 hour raid with 16 players sharing all data in and out of combat in real time.

This probably wouldn't have to be all done in real time though so you could probably get even more savings out of it.
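
A minimal sketch of what compressing a stream with zlib can look like, using Python's standard `zlib` module. The function name `make_stream` and the sample lines in the usage note are hypothetical and are not real SWTOR log lines; the actual server described above is not shown in this thread. The point is that one compressor is kept alive for the whole connection, so later records can reference earlier ones.

```python
import zlib


def make_stream():
    """Return a (send, receive) pair that share one zlib stream."""
    compressor = zlib.compressobj(6)
    decompressor = zlib.decompressobj()

    def send(record: str) -> bytes:
        data = record.encode("utf-8") + b"\n"
        # Z_SYNC_FLUSH pushes out everything buffered so far, so the
        # receiver can decode this record right away, while the
        # compressor keeps its history for the records that follow.
        return compressor.compress(data) + compressor.flush(zlib.Z_SYNC_FLUSH)

    def receive(chunk: bytes) -> str:
        return decompressor.decompress(chunk).decode("utf-8")

    return send, receive
```

Used like this, the second of two similar lines should need fewer bytes on the wire than the first, because it is encoded mostly as a reference to the earlier line:

```python
send, receive = make_stream()
first = send("12:00:01 PlayerA hits TrainingDummy with SomeAbility for 1200 damage")
second = send("12:00:02 PlayerA hits TrainingDummy with SomeAbility for 1350 damage")
print(len(first), len(second))
```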
Jan 14, 2015 at 6:03 AM
Edited Jan 14, 2015 at 6:05 AM
I've read many of your comments at swtor.com and sometimes use the same mechanics to parse the log, e.g. (if you haven't changed it) the 15 second window after stealth to re-enter, the one second appended after the ExitCombat event, and so on. I do this because it seems clever to me and because our raid is using Parsec, so it's also a win for us if the numbers stay the same. I likely haven't found every trick you use yet. :-)
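
The two rules mentioned here (one second appended after ExitCombat, and a 15 second window to re-enter after stealth) can be sketched roughly as follows. This is not Parsec's code: the event tuples, the kind labels `enter`, `exit`, `stealth`, `hit`, and the `Fight` class are hypothetical simplifications, and the events are assumed to be already sorted by timestamp.

```python
from dataclasses import dataclass, field

EXIT_GRACE = 1.0       # seconds appended after the ExitCombat event
STEALTH_WINDOW = 15.0  # seconds allowed to re-enter combat after stealth


@dataclass
class Fight:
    start: float
    end: float = 0.0
    hits: list = field(default_factory=list)
    used_stealth: bool = False


def split_fights(events):
    """events: iterable of (time_in_seconds, kind, amount), sorted by time."""
    fights = []
    in_combat = False
    for t, kind, amount in events:
        last = fights[-1] if fights else None
        if kind == "enter" and not in_combat:
            relink = (last is not None and last.used_stealth
                      and t - last.end <= STEALTH_WINDOW)
            if relink:
                last.used_stealth = False  # continue the previous fight
            else:
                fights.append(Fight(start=t))
            in_combat = True
        elif kind == "stealth" and in_combat:
            last.used_stealth = True
        elif kind == "exit" and in_combat:
            last.end = t
            in_combat = False
        elif kind == "hit" and last is not None:
            # Late hits still count if they land within the grace period.
            if in_combat or t - last.end <= EXIT_GRACE:
                last.hits.append(amount)
    return fights
```

With this sketch, a hit logged 0.4 seconds after the exit event is still credited to the fight, and an exit caused by stealth followed by a re-entry 8 seconds later yields one fight instead of two.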

As for storage, it is possible to remove the largest part of the log once the data have been created from it, and this would be my last option if I don't find other ways. Currently I'm considering compressing all texts before dropping them into the database and optimizing the tables (smaller types and so on). But I have to see whether compression is a performance issue, even if I do it later with a cron job when the server load is low.

I'm using PHP5 and MySQL (InnoDB).

Greetings,
Kinman
Jan 14, 2015 at 9:52 AM
I would very much advise against storing combat log data in a MySQL database. If you get a lot of users, you are going to need a very powerful database server to maintain any kind of decent performance.
Jan 14, 2015 at 10:22 AM
MySQL's performance is not that bad as long as you don't have complex queries. For the moment I'll stick with it because I don't have to pay extra. But if performance becomes a problem, I'll of course consider a change.

At the moment I can analyse and store 2,000 log lines/sec per core on an i7 2600K in debug mode. I'm aiming for 5,000 in production mode on the optimized web server.

Greetings,
Kinman
Jan 20, 2015 at 10:33 AM
I've just finished the basics of effective healing and noticed a difference between PARSEC and my calculations for guarded healers. In my calculations I look for each guard apply and remove effect. It seems that this effect is applied and removed when you enter or leave the 15 or 30 m range of the tank (I have to verify which it is). So I have some heals without the guard effect although the healer is guarded, just not in range of the tank.
But the threat reduction seems to work even if the healer is out of range. PARSEC gets it right. When I manually override all non-guarded heals with guarded heals, I get the same EHPS number as in PARSEC.

My question now is: how do you do it? :)
Do you just assume that if a healer is guarded at the beginning of a combat, he's guarded throughout the combat (unless he dies), or is there something else in the log that identifies the threat reduction?

Greetings,
Kinman
Coordinator
Jan 22, 2015 at 5:12 AM
Instead of a Boolean to store guard state, I use a counter. I increment the counter when a guard is added and decrement it when a guard is removed. A target is guarded if the counter is greater than 0.
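
A minimal sketch of the counter idea, assuming one counter per target. The class and method names are hypothetical and this is not Parsec's actual code.

```python
from collections import defaultdict


class GuardTracker:
    """Tracks guard state per target with a counter instead of a flag."""

    def __init__(self):
        self._counts = defaultdict(int)

    def apply(self, target):
        # A guard apply effect was logged for this target.
        self._counts[target] += 1

    def remove(self, target):
        # A guard remove effect was logged for this target.
        self._counts[target] -= 1

    def is_guarded(self, target):
        return self._counts.get(target, 0) > 0
```

The difference from a Boolean shows up when the log contains a second apply before the matching remove: after apply, apply, remove, a flag would read "not guarded", while the counter is still 1.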
Coordinator
Jan 22, 2015 at 5:31 AM
MorgenBlue wrote:
DrewCerny wrote:
However, I need to find the time to develop the site and also determine if my existing hosting situation can support the size and bandwidth of the service and an online tool. The biggest problem that all the online services run into is funding because it is a lot of data moving back and forth.
It's not that bad if you compress the data and write your own server for it. Something like zlib makes compressing a stream of data pretty easy for something like sharing combat log data in real time. I've been messing around with this recently and it consumes less bandwidth for a 16 player group than Parsec's current implementation: about 200 KB out per player for a 3 hour raid with 16 players sharing all data in and out of combat in real time.

This probably wouldn't have to be all done in real time though so you could probably get even more savings out of it.
You're suggesting peer to peer, real time combat log sharing?

We are talking about a service that allows fight uploads, online viewing and sharing.

Regardless, I would be interested in seeing how you send compressed raw combat log data for 16 players at less bandwidth than the simplified stats that Parsec sends back and forth. Parsec serializes the stats into a simplified string, then uses HTTP compression to transmit to the service and back. Its bandwidth is pretty low per user and it updates every ~5 seconds during combat.

I never went very far in designing a direct communication style client because of the challenges associated with supporting something like that. Parsec is pretty easy to get up and running. Something with direct communications or a server client mode would introduce all kinds of networking problems that I am not prepared to support.
Jan 22, 2015 at 8:53 AM
DrewCerny wrote:
Instead of a Boolean to store guard state, I use a counter. I increment the counter when a guard is added and decrement it when a guard is removed. A target is guarded if the counter is greater than 0.
Thank you very much for the hint, now the numbers are nearly the same (< 1% difference).

Greetings Kinman
Jan 23, 2015 at 6:46 AM
Edited Jan 23, 2015 at 6:57 AM
You can go with the P2P approach or a regular client server model. I'm using the regular old client server; you probably need to write your own server, though, and maintain a constant connection. A lot more complex, yes, but I think the limitations of the raid service in Parsec are one of its greatest weaknesses. Wouldn't it be great to have all the data in the non-raid tabs available for every group member in your raid group, trigger timers from other people's data, etc.?

Each individual combat log record gets serialized -> encoded to base64 -> compressed with zlib -> sent to the stream.
You could probably get even better savings out of it by sending them in batches, since you will usually have multiple combat log records every time the combat log file is updated while in combat.
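
The pipeline and the batching idea can be sketched like this. JSON as the serialization format, the function names, and the record fields are assumptions for illustration; the post does not say how records are serialized.

```python
import base64
import json
import zlib


def encode(payload) -> bytes:
    """Serialize -> base64 -> zlib, in the order described above."""
    serialized = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return zlib.compress(base64.b64encode(serialized))


def decode(blob: bytes):
    """Reverse of encode: zlib -> base64 -> deserialize."""
    return json.loads(base64.b64decode(zlib.decompress(blob)))


def size_per_record(records) -> int:
    # Every record is compressed on its own, so each one pays the
    # zlib header and trailer and cannot reuse earlier records.
    return sum(len(encode(record)) for record in records)


def size_batched(records) -> int:
    # All records written in one log update share a single zlib stream.
    return len(encode(list(records)))
```

For a list of similar records, `size_batched(records)` should come out smaller than `size_per_record(records)`, which is the saving suggested above. How large the gap is depends on the real record format.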
Feb 2, 2015 at 5:01 PM
I've uploaded my current version of the tool. If you'd like to have a look:
Heal Log
Damage Log
Log overview

Greetings
Kinman
Coordinator
Feb 3, 2015 at 6:37 PM
MorgenBlue wrote:
You can go with the P2P approach or a regular client server model. I'm using the regular old client server; you probably need to write your own server, though, and maintain a constant connection. A lot more complex, yes, but I think the limitations of the raid service in Parsec are one of its greatest weaknesses. Wouldn't it be great to have all the data in the non-raid tabs available for every group member in your raid group, trigger timers from other people's data, etc.?

Each individual combat log record gets serialized -> encoded to base64 -> compressed with zlib -> sent to the stream.
You could probably get even better savings out of it by sending them in batches, since you will usually have multiple combat log records every time the combat log file is updated while in combat.
This is just not feasible in a free application with a centralized server. It is just too much data, even compressed.

In a 3 hr raid I generate a 5 MB log file, so that is roughly 1,700 KB per hour.

Assume zlib compresses that to 60% of its original size: 1,020 KB per hour.

If I am in a raid group of 8 people, I will send 1,020 KB per hour but I will receive 7,140 KB (7 x 1,020).

During peak times the server sees about 3,000 connected users at one time, which means the server will receive roughly 3 GB and send 21 GB per peak hour.

Peak times last about 3 hours for the US and 3 hours for the EU, and I would estimate that the remaining 18 hours generate as much traffic as 2 more hours of peak time, for a total of 8 hours per day of peak time.

So here are some totals to host this solution:

Per Day Bandwidth In: 24 GB
Per Day Bandwidth Out: 168 GB
SQL Server or In Memory storage capacity: 21 GB
Peak Bandwidth: 204 MB/sec
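
The bandwidth in and out totals can be reproduced with a few lines of arithmetic. This sketch reads the per-hour figures as kilobytes (5 MB over 3 hours is roughly 1,700 KB per hour) and rounds the hourly server totals to whole gigabytes, as the post does. The function and parameter names are hypothetical. It does not attempt to reproduce the storage, peak bandwidth, or hosting cost figures.

```python
def estimate_traffic(log_kb_per_hour=1700, compressed_percent=60,
                     raid_size=8, peak_users=3000, peak_hours_per_day=8):
    """Rough server traffic estimate using the numbers from the post."""
    kb_per_gb = 1_000_000

    # Per user: compressed upload, plus a copy from every other raid member.
    sent_kb = log_kb_per_hour * compressed_percent // 100
    received_kb = sent_kb * (raid_size - 1)

    # Server side, per peak hour, rounded to whole gigabytes.
    in_gb_per_hour = round(peak_users * sent_kb / kb_per_gb)
    out_gb_per_hour = round(peak_users * received_kb / kb_per_gb)

    return {
        "sent_kb_per_hour": sent_kb,
        "received_kb_per_hour": received_kb,
        "in_gb_per_peak_hour": in_gb_per_hour,
        "out_gb_per_peak_hour": out_gb_per_hour,
        "in_gb_per_day": in_gb_per_hour * peak_hours_per_day,
        "out_gb_per_day": out_gb_per_hour * peak_hours_per_day,
    }
```

Calling `estimate_traffic()` with the defaults gives 1,020 KB sent and 7,140 KB received per user per hour, 3 GB in and 21 GB out per peak hour, and 24 GB in and 168 GB out per day.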

Hosting to provide for this solution would cost about $1,500 per month through Amazon EC2.

The only way to provide this type of data to every client would be a peer to peer solution. Then I would be stuck trying to support 15,000 users who have firewall, router or other network problems. No thanks.