Subject: "mud performance" Previous topic | Next topic
Printer-friendly copy Email this topic to a friend CF Website
Top General Discussions Gameplay Topic #60868
Show all folders

IsildurTue 06-Oct-15 05:29 PM
Member since 04th Mar 2003
5969 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60868, "mud performance"


          

So I noticed the game feeling kind of laggy today despite my ping times looking normal. This made me think maybe the delay is internal to the cf server and not actually a networking issue.

I ran that little "CF ping" script I cooked up a couple weeks ago, which just issues a "nod" command and then times how long it takes before my client sees the "You nod" response. The average was around ~250ms and most of the times were right around there. However, some were considerably shorter (~100ms) and some considerably longer (~500ms).

My round-trip network latency is 35ms, so at a bare minimum my "cf ping" will be at least 35ms.

Internally the game employs some system of pulses so things don't happen "immediately" once a command is received. I get that. My "nod" could arrive just prior to a checkpoint or it could arrive just after a checkpoint, in which case I gather it would have to wait until the next checkpoint before being processed. There's probably also some sort of queuing system, so it's possible it could miss that one as well if there are too many commands queued up.

But when a command like "nod" sometimes takes only ~100ms to get a response but occasionally takes ~500ms...that makes me think something wonky is going on.

Any thoughts? Am I just way off?

btw, please don't take this post as a gripe. That is absolutely not my intent. I'm just trying to help out. If the above is inaccurate to the point of being unhelpful than I apologize in advance for trying to "armchair diagnose".

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

Reply Inconsistent processing times were present in the past, -flso, 08-Oct-15 07:54 PM, #9
Reply RE: Inconsistent processing times were present in the p..., Umiron, 08-Oct-15 08:38 PM, #10
Reply RE: Inconsistent processing times were present in the p..., Isildur, 08-Oct-15 09:26 PM, #11
Reply RE: mud performance, Umiron, 06-Oct-15 06:12 PM, #1
     Reply RE: mud performance, Isildur, 06-Oct-15 08:21 PM, #2
          Reply RE: mud performance, Umiron, 07-Oct-15 08:04 AM, #3
               Reply RE: mud performance, Isildur, 07-Oct-15 08:57 AM, #4
               Reply RE: mud performance, Isildur, 07-Oct-15 01:20 PM, #7
                    Reply RE: mud performance, KaguMaru, 07-Oct-15 02:00 PM, #8
               Reply i've been getting disconnected a lot, Dallevian, 07-Oct-15 09:14 AM, #5
                    Reply RE: i've been getting disconnected a lot, Isildur, 07-Oct-15 11:01 AM, #6

-flsoThu 08-Oct-15 07:30 PM
Member since 02nd Oct 2007
296 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60895, "Inconsistent processing times were present in the past"
In response to Reply #0
Edited on Thu 08-Oct-15 07:54 PM

          

when Zulg was around, right before he left really, I did extensive tests at different times of day with different number of people logged in.

When there were few/no people logged in, CF was *flying*. Movement was ultra-smooth and everything felt very fluid and *consistent*, timing wise.

When 30+ people were logged in, there was obvious jitter in everything but one could immediately see it in movement lag (if was used to the previous condition).

My ping time to CF during those periods was also 100% stable (below 50ms).

I think back then, CF was running on shared hosting (so other #### running on the same machine by user accounts not related to CF may have affected performance), maybe that is no longer the case today so if performance issues remain, they must be caused by CF itself.


The default pulse time for stock ROM is 250ms AFAIK (4 pulses per second).
Assuming CF pulse time is 250ms, if everything is running within spec,
one should never see a command take *more* than 250+network roundtrip to CF
from the time he sends the packet to the time he gets the response.

That number can jitter as long as it stays below 250+roundtrip. If it jitters above,
it indicates that the CF process can't schedule 4 pulses in one second. The higher it jitters (say 1second or more that I've seen in some cases) the more loaded the CF process is and the higher the actual pulse time becomes.

Another way (the smart way really) to time this sort of issue in bulk, without doing the timing from inside a mud client is to simply use tcpdump. You don't want to do the timing inside a MUD client cause you're introducing unwanted timing overhead from the MUD client process, the scripts it's running and whatever scripting language it exposes for you to do the timing in. This overhead could be in the order of tens of milliseconds to more than a couple of hundred which would make the results useless.

First, capture the packets (my example ROM server is running at port 4000 and the data packets that my mud client/server send have the PUSH flag set, so i'm only capturing those):


tcpdump -vvni lo0 -X -s 0 -w rom.pcap "port 4000 and tcp\(tcpflags\) & tcp-push != 0"

\( and \) are square brackets.

Next, type "nod" 10 times say in the mud client.
Finally, stop the packet capture and calculate time deltas between packet pairs (request/response).

tcpdump -r rom.pcap |cut -f 3 -d ':' |cut -f 1 -d ' '| tail -r |gpaste -s -d '-\n' | bc

For my example:

.243956
.227105
.230149
.242648
.217974
.153603
.104957
.120465
.161000
.184953

So we can see here that we never go above 250 ms which is right since i'm running this ROM server locally so network lag is negligible.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

    
UmironThu 08-Oct-15 08:38 PM
Member since 29th May 2017
1497 posts
Click to send email to this author Click to send private message to this author Click to add this author to your buddy list
#60896, "RE: Inconsistent processing times were present in the p..."
In response to Reply #9


          

We already know that the main "game loop" occasionally takes longer than it should because we time it, but we haven't identified why.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

    
IsildurThu 08-Oct-15 09:26 PM
Member since 04th Mar 2003
5969 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60897, "RE: Inconsistent processing times were present in the p..."
In response to Reply #9


          

I would be shocked if the overhead from mud client processing were significant. It doesn't take on the order of 100ms to match a regex.

Re: the delays Umiron mentioned in his reply, I would strongly lean towards something I/O related. Either filesystem or database. But, again, that's just a guess.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

UmironTue 06-Oct-15 06:12 PM
Member since 29th May 2017
1497 posts
Click to send email to this author Click to send private message to this author Click to add this author to your buddy list
#60871, "RE: mud performance"
In response to Reply #0


          

It's certainly possible the MUD Is responsible for some additional lag in addition to any possible network-related latency. It's also possible that other applications on the box are too, either because they're consuming resources CF needs (e.g., disk I/O, etc.) or because CF directly depends on them (e.g,. the database).

Some time ago (before my time), Zulg made a list of major functions within the codebase and added some rudimentary profiling code to them that logs how log they took to execute. Looking at that, it would appear that a handful of them are occasionally over the goals that he originally set for them, though I have no idea what criteria went into setting those original goals. Because of the way CF (and most Diku MUDs) work, that can ultimately lead to a variance in the time (in milliseconds) that a "tick" or other regularly schedule event takes. Having said all that, none of these numbers look seriously out of hand.

Obviously when exactly a "nod" goes through will impact how long it takes for the MUD to respond in that if you happen to send it right before an expensive function call (e.g., a tick, but there are others a mortal wouldn't necessarily notice) then it may take longer, though ideally not so much so that it impacts your game play.

If your script is something I can use in TinTin++ or another CLI client, and you want to share it, I would happily run it on a character connected to CF on localhost and thus effectively ruling out inconsistent network speed.

Unfortunately, I don't really have the ability to profile CF under load or live conditions, but I'll try to do some digging on a test port and see what I see.


  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

    
IsildurTue 06-Oct-15 08:20 PM
Member since 04th Mar 2003
5969 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60877, "RE: mud performance"
In response to Reply #1
Edited on Tue 06-Oct-15 08:21 PM

          

My guess on the slow ones is some blocking I/O that everything else has a dependency on. Probably db related. Total guess, so take with a huge grain of salt.

Here's the script if you want to try it on the server:

http://forums.carrionfields.com/dc/dcboard.php?az=show_topic&forum=6&topic_id=60652&mesg_id=60652&page=2#60657

I wrote it for tintin++ so it should work. Enter "cfping" to start it running and "off" to stop. Right now it just echos each round-trip time but doesn't compute any stats.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

        
UmironWed 07-Oct-15 08:04 AM
Member since 29th May 2017
1497 posts
Click to send email to this author Click to send private message to this author Click to add this author to your buddy list
#60880, "RE: mud performance"
In response to Reply #2


          

The difference between my laptop and the CF server was negligible. I only ran it for a few minutes on each session, but in that time I wasn't able to see the bigger discrepancies you were describing.

I'll dig a little more as I have time.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

            
IsildurWed 07-Oct-15 08:57 AM
Member since 04th Mar 2003
5969 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60881, "RE: mud performance"
In response to Reply #3


          

In the mean time, I'll try to modify the script to compute min/max/mean/median/stdev. Probably just store all the times in a list and compute the stats when it's switched off. With the understanding that if you let it run too long it'll blow up tintin++.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

                
IsildurWed 07-Oct-15 01:16 PM
Member since 04th Mar 2003
5969 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60890, "RE: mud performance"
In response to Reply #4
Edited on Wed 07-Oct-15 01:20 PM

          

Here's a version that doesn't output each time but does output
the min, max, average, median and count when you switch it
off.

#ali {cfping} {#act {You nod.} {#format {endtime} {%U};#math
{duration} {$endtime - $starttime};#math {duration} {$duration
/ 1000};#list {times} {add} {$duration};#format {starttime}
{%U};nod};#ali {off} {#unact {You nod.};#unali {off};#unvar
{starttime};#unvar {endtime};#unvar {duration};#var {min}
{9999};#var {max} {0};#var {avg} {0};#list {times} {size}
{size};#math {halfsize} {$size / 2};#forall {$times[%*]} {#if
{&0 < $min} {#var {min} {&0}};#if {&0 >
$max} {#var {max} {&0}};#math {avg} {$avg + &0}};#list
{times} {sort};#math {avg} {$avg / $size};#math {med}
{$times[$halfsize]};#showme {min: $min ms,  max $max ms,  avg:
$avg ms,  med: $med ms,  count: $size}#unvar {min};#unvar
{max};#unvar {avg};#unvar {med};#unvar {size};#unvar
{halfsize};#unvar {times}};#unvar {times};#format {starttime}
{%U};nod}


Here's my output from just now:

min: 101 ms,  max 1373 ms,  avg: 259 ms,  med: 255 ms,  count:
594

I may go back and add standard deviation but doing math in
tintin++ is kind of a pain.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

                    
KaguMaruWed 07-Oct-15 02:00 PM
Member since 15th Sep 2012
805 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60892, "RE: mud performance"
In response to Reply #7


          

I'm guessing there's no square root function, but variance is
just E(x)^2 - E(x^2) and if you could get E(x) there's no
reason you can't get those two quantities. Standard deviation
is then just the square root of the variance.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

            
DallevianWed 07-Oct-15 09:14 AM
Member since 04th Mar 2003
1639 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60884, "i've been getting disconnected a lot"
In response to Reply #3


          

about 2 to 3 times an hour it will totally disconnect

it doesn't matter if i'm mudding from my mac at home or from wifi on my work computer during lunch, same thing is happening

it isn't impacting other people, just some, from what i've noticed on the 'who' list

could it be from what Isildur is experiencing, high latency at periodic times that makes my session think it lost connection?

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

                
IsildurWed 07-Oct-15 11:01 AM
Member since 04th Mar 2003
5969 posts
Click to send email to this author Click to send private message to this author Click to view this author's profile Click to add this author to your buddy list
#60888, "RE: i've been getting disconnected a lot"
In response to Reply #5


          

If you're getting completely disconnected then it's almost surely a network thing. And if other people *aren't* getting disconnected then it's probably something happening "out there on the internet". That said, it's possible the hosting company is at fault.

  

Alert | IP Printer Friendly copy | Reply | Reply with quote | Top

Top General Discussions Gameplay Topic #60868 Previous topic | Next topic