Wednesday, February 29, 2012

250 Concurrent Users, Tuning and Citrix

Over the last week the server load has grown, and it's interesting to see how well the machine is running GNOME and all of those sessions. The shot below shows how it looks at right around 250 concurrent users; for the most part everything is working well. I've marked some areas in color.

I'm seeing a high level of communication between polkit and dbus (purple). I'm not sure what's happening here, but I'll try to sniff it with dbus-monitor and see exactly what's chewing CPU. This activity is not slowing the machine terribly.
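For the sniffing step, something like the following could tally which interfaces and members show up most often in captured dbus-monitor output. It's only a sketch: the function name count_messages is made up for illustration, and the regular expression assumes the usual header-line shape ("signal ... interface=...; member=..."), which may differ between dbus versions.

```python
import re
from collections import Counter

# Assumed header shape (may vary by dbus version), for example:
#   method call sender=:1.5 -> dest=org.example serial=12 path=/x; interface=org.example.Foo; member=Bar
# "method return" and "error" lines carry no interface and are skipped.
HEADER = re.compile(
    r"^(?:signal|method call)\b.*?interface=(?P<iface>[^;\s]+);\s*member=(?P<member>\S+)"
)

def count_messages(lines):
    """Count D-Bus messages per (interface, member) pair (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = HEADER.match(line)
        if match:
            counts[(match.group("iface"), match.group("member"))] += 1
    return counts
```

Feeding it the lines captured from dbus-monitor --system and printing count_messages(lines).most_common(10) should show whether one polkit call dominates the traffic.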

Evince-thumbnailer had a feature added, which is not in the version shipped with openSUSE 11.4, that terminates the thumbnailer after about 5 seconds if it cannot finish. Without it, on certain PDFs the thumbnailer appears either to hang or to take more than a few minutes because the PDF is so huge. This is causing a small spike in CPU. As users navigate, these thumbnails get cached, so we'll see fewer and fewer of them as time goes by.
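The general idea of that timeout can be approximated from the outside by wrapping any slow command in a time limit. This is not how evince implements it; run_with_timeout is a hypothetical wrapper, and the 5-second default simply mirrors the figure above.

```python
import subprocess

def run_with_timeout(cmd, seconds=5):
    """Run cmd (a list of arguments); return True if it finished in time.

    subprocess.run kills the child and raises TimeoutExpired when the
    limit is exceeded, so a runaway process does not keep burning CPU.
    """
    try:
        subprocess.run(
            cmd,
            timeout=seconds,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        return False
```

A False return means the command was cut off, which for a thumbnailer would simply mean no thumbnail for that file.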

Host-based Citrix sessions are chewing CPU, as I have mentioned previously; the canvas rewrites are expensive. As marked in red, you can see that they're collectively using a good amount. More information on this project below.



The server has 64 GB of memory, and at right about 250 users it has to start writing some cache files to disk, which causes very short pauses. We're going to throw in a few more memory sticks, and that should allow us to run closer to 300. This is not a serious problem: the pauses last only a few seconds, a few times a day, and only under very heavy load.
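As a rough back-of-the-envelope check on the memory numbers, the sketch below assumes usage scales linearly with the number of sessions. That is a simplification (some memory is shared or used by the system regardless of user count), and memory_needed_gb is just an illustrative helper.

```python
def memory_needed_gb(current_gb, current_users, target_users):
    """Linear extrapolation: same average memory per user at the new load."""
    per_user_gb = current_gb / current_users
    return per_user_gb * target_users

# Figures from the post: 64 GB is roughly full at 250 users.
per_user_mb = 64 * 1024 / 250
needed = memory_needed_gb(64, 250, 300)
print(f"{per_user_mb:.0f} MB per user, {needed:.1f} GB for 300 users")
```

Under that assumption each session averages about 262 MB, and 300 users would want roughly 77 GB, so a few more sticks lines up with the target.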

The users have discovered ways to leave "dead" gdm child processes behind that don't seem to halt on their own. I suspect people are doing things like powering off thin clients with sessions running, starting to log in and then turning off the thin client, and letting the server sever their connections because they don't log off at night (our server kicks users off after 13 hours). I'm writing a little script that will run nightly and remove these processes. They aren't affecting speed, but are probably consuming a bit of memory.
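A nightly cleanup along these lines might start with a filter like the one below. It is a sketch under loud assumptions: it treats a process as "dead" only because it has outlived the 13-hour session limit, it expects input lines of the form "pid elapsed-seconds command" (for example from ps -eo pid=,etimes=,comm= where ps supports etimes), and the process names to match are left to the caller because the real names of the leftover children aren't given here. The main gdm daemon must not be in that list.

```python
import os
import signal

MAX_AGE_SECONDS = 13 * 3600  # the server's session limit, from the post

def find_stale(ps_lines, names, max_age=MAX_AGE_SECONDS):
    """Return PIDs whose command is in `names` and whose age exceeds max_age.

    `names` is a caller-supplied set of exact command names (hypothetical).
    Lines that do not look like "pid seconds command" are ignored.
    """
    stale = []
    for line in ps_lines:
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, elapsed, comm = parts
        if not (pid.isdigit() and elapsed.isdigit()):
            continue
        if comm.strip() in names and int(elapsed) > max_age:
            stale.append(int(pid))
    return stale

def terminate(pids):
    """Send SIGTERM to each PID; review the list by hand before using this."""
    for pid in pids:
        os.kill(pid, signal.SIGTERM)
```

Printing the result of find_stale for a few nights before ever calling terminate would be the cautious way to confirm the filter only catches the leftovers.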

As mentioned, I'm making progress with running Citrix locally on the thin clients. When the user picks an MS Windows application that uses Citrix, a signal is passed to the thin client; Citrix is then started there and forms a connection to the Windows server. Once this is handed off, no additional resources are consumed by the GNOME server and the users have a compressed stream directly to the thin client, with no more X11 traffic running over the network. Early testing already indicates this will run faster. I'll mount a section of memory for the Citrix cache so the clients aren't hammering their flash drives, and this should give them optimal speeds.

Other concurrent projects: Connecting LibreOffice to our infrastructure and testing it fully, reviewing Evolution crashes and working with Novell on fixes.
