Friday, January 27, 2012

Losing Planet GNOME Feed

Looks like I'll be losing my Planet GNOME feed in the next few weeks based on the new policy changes. I'll still be blogging at the same frequency, but you'll have to pick up the feed from Blogspot. I've greatly enjoyed the interaction with all of you and hope that it continues.

Thursday, January 26, 2012

Full Steam Ahead

Now that the tuning issues have been resolved, we have started to move more and more users over to the GNOME server. Just now we hit a new high of 108 concurrent users logged into one server.

And as promised, let's see how the server is doing with the 108 users (shots below). The answer is: excellent. Time to ramp up workstation upgrades and move more people over. There have been a few bumps in completing this project, but it's been worth the effort. All City employees will be able to log in and get core software packages (browser, authentication, word processing, image editing and more) without any license costs.

Tuesday, January 24, 2012

DBus Fixed + Support Portal Changes

After a lengthy process, we finally have a solution to the issue of dbus not allowing more than about 95 users into our new GNOME server. Dbus needed to be patched (thanks Vincent!), and then we continued to hit the limit of 1024 files being opened by the user 'messagebus'. /etc/sysconfig/ulimit has an entry that is supposed to configure the number of files that a user can open at once. It turns out that PAM is performing the same function and was doing the blocking. So I made the change as seen below to /etc/security/limits.conf and it worked. I don't see any other roadblocks and we'll start moving people over to the new server in greater numbers starting next week. Good news!
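For anyone chasing a similar limit, one way to confirm which open-file limit a running daemon actually received is to read /proc/<pid>/limits on Linux. This is only a minimal sketch, assuming a Linux /proc filesystem; the function names are made up for illustration, and it reports the limit without changing it.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the text of /proc/<pid>/limits, or None.

    Values are returned as strings because the kernel may report
    'unlimited' instead of a number.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # fields: ['Max', 'open', 'files', soft, hard, 'files']
            return fields[3], fields[4]
    return None


def max_open_files(pid):
    """Read the open-file limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as limits_file:
        return parse_max_open_files(limits_file.read())
```

Calling max_open_files() with the pid of the system dbus-daemon before and after a limits change would show whether the new value really reached the process.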

And here is our support portal showing the 100 concurrent load, woohoo.

We met yesterday and finally made some decisions on the way that NX will be implemented for outside connections. The world has changed greatly in the 5 years since the last install, and we are trying to find the right balance between security and functionality. I added a new tab to the portal to show "Logs" and we can now see people logging into the GNOME server in real time. The servers are full of information to alert us that people are having problems or technical issues, so I started to gather all of that information and get it into a new section called "Alerts". We'll be able to see people who tried to launch software that they do not have a license to run, crashes, password problems and problems with settings. This should allow us to be a bit more proactive and do more with less (tm).

Thursday, January 19, 2012

Poor Man's Google Earth

In my last blog, I mentioned that I had gone in and made changes to our thin client build to accommodate running NX sessions along with local RDP. We had another thin client project scheduled for 30-45 days in the future and because I was already in the code it was the best choice to just finish it and roll out all features at the same time.

Our county is doing a cool project to gather GPS information on the squad cars from all of the agencies and build a KML file for Google Earth. This will allow the various departments to see if fellow police officers are nearby. My coworkers did all of the front work and got this connectivity working back to the county. Our police department wanted to display this information on a TV in the 911 center. Instead of allocating a full PC with operating system to this function (a waste), we decided to just build a new kiosk mode on our thin clients. So I made the changes to the build. A chooser comes up, you click on a button and GoogleEarth runs on one of our big servers and remote displays back to the thin client. Easy and works great and saves time and money vs installing a thick client for this function.

The next logical step of their project is trying to get this information into the squad cars. There are many scaling and bandwidth problems with trying this. The hardware in rugged PCs is sub-optimal for this function. These devices are run for 5 years, so many of them are very old and would come to a crawl with GoogleEarth installed. There is also the issue of this function sharing bandwidth on the EVDO cards. The networking is already busy doing mission critical functions and cannot be bogged down with another major feature. Another idea would be to run the Google Earth software on a server with their already existing NX infrastructure, but the prospect of 25+ of these running concurrently was not pleasant.

So I had an idea that we are testing now, which was to create a poor man's Google Earth. This was so simple that it took me literally less than 1 hour to write, test and put on the GNOME menu. A cron job runs every five minutes, does a screen grab of the Google Earth session running in the 911 center and dumps it to an image file. I then wrote a quick Glade/Python UI that simply displays this picture and refreshes the image on a 2 minute timer. The memory footprint is very light, and everyone is sharing the same image file. So this should scale wonderfully and will barely produce any server load.
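The viewer side can be sketched roughly like this. It is not the actual Glade/Python code; it is a hypothetical helper (the ImageWatcher name is invented) that a timer callback could call, and it adds one assumption of mine: reload the picture only when the shared file's modification time has changed.

```python
import os


class ImageWatcher:
    """Tracks a shared image file and reports when it has been replaced."""

    def __init__(self, path):
        self.path = path
        self._last_mtime = None  # nothing loaded yet

    def changed(self):
        """Return True if the file exists and differs from the last check."""
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            # The first grab may not have run yet; keep showing what we have.
            return False
        if mtime != self._last_mtime:
            self._last_mtime = mtime
            return True
        return False
```

In a GTK program the 2 minute timer callback would call changed() and reload the image widget only when it returns True, so an unchanged file costs one stat() call per client.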

Here is the UI. Buttons are nice and big to accommodate their touch screens and allow them to touch with their fingers in a moving car. The image is in a scroll region to ensure it works correctly with their 800x600 and 1024x768 resolutions.

In the shot below, I have marked the memory footprint of Google Earth on the server. 25+ of these would be quite a load in memory and CPU. That's 25+ users all performing the same steps and retrieving the same information independently...not a good design. The other window shows the very simple code used to grab the screenshot from the 911 center. The script pings the device, if it's alive it grabs a shot, creates multiple scales and then moves the finished product into place. The Glade/UI then loads the image into the canvas.
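The control flow of that script (check the device, grab, move the finished file into place) can be sketched in Python as below. This is an illustration under assumptions, not the script in the screenshot: the is_alive and grab callables are hypothetical stand-ins for the ping and the screen grab, and the scaling step is left out.

```python
import os
import tempfile


def publish_atomically(data, dest):
    """Write data next to dest, then rename it over dest in one step.

    Viewers reading dest never see a half-written image, because the
    rename replaces the old file all at once.
    """
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
        # mkstemp creates the file private to its owner; open it up so
        # every user can read the shared image.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, dest)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def refresh_snapshot(is_alive, grab, dest):
    """Grab and publish a new image only if the source device responds."""
    if not is_alive():
        return False  # leave the previous image in place
    publish_atomically(grab(), dest)
    return True
```

Writing to a temporary file in the same directory matters because a rename is only atomic within one filesystem.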

Tuesday, January 17, 2012

Local RDP, NX 3.5 Installed

After waiting many months for the release of NX 4, we reached a point where we could not wait any longer to use compression technology with the new GNOME desktop servers. We were really hoping that NX 4 would fall into place so that we could deploy iPads running host based software. So last week I removed the NX 4 beta and installed the released/stable NX 3.5. After remembering a few tweaks we did a number of years ago, it's all up and running. I removed the NX 4 client from our thin clients and reinstalled the older version as well. A few script changes and improvements and things were working. I should be able to get a test build to remote site users this week.

Our division got word very late in the implementation process of the new MS Windows based Recreation software that thin client workstation IDs were needed for the software to properly know which cash drawer to use. For Recreation sites running XDMCP, this was no problem and already implemented. I had offloaded RDP/Rdesktop sessions many thin client releases ago and they were already running on the local device. But we have one site that was using NX in full screen mode with no window manager. So when a local RDP session was started, there was no window manager to grab it and allow the user to move the two windows around. RDP would just sit over NX and obscure the view. It was clear that a new type of thin client build was required, one that connected to our servers with NX and that was running a local window manager. After a few white board drawings, I picked the path that seemed the best solution and made the necessary changes. As is seen in the shot below, when the user picks this option it automatically detects the size of the screen and NX is started consuming about 90% of the screen. The window manager then wraps this window and they are able to move it around nicely. When they click on this MS Windows app, the server sends a signal to the thin client, which then launches Rdesktop locally, and the window manager grabs this window as well. I put a small modal applet in the upper left hand corner to allow the user to disconnect this session and shut down the window manager. A bit more testing and this should be ready for beta testing, hopefully later this week.
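The sizing arithmetic is simple enough to show. The sketch below is not the thin client build script; it just works out a standard X11 geometry string covering about 90% of a given screen. Centering the window is my assumption for the example, and the function name is made up.

```python
def centered_geometry(screen_w, screen_h, percent=90):
    """Return an X11 geometry string (WxH+X+Y) covering percent of the screen."""
    # Integer arithmetic avoids floating point rounding surprises.
    width = screen_w * percent // 100
    height = screen_h * percent // 100
    # Split the leftover space evenly so the window sits in the middle.
    x = (screen_w - width) // 2
    y = (screen_h - height) // 2
    return f"{width}x{height}+{x}+{y}"


print(centered_geometry(1024, 768))  # 921x691+51+38
print(centered_geometry(800, 600))   # 720x540+40+30
```

How the string is handed to the NX client depends on the client version, so that part is left out.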

Friday, January 06, 2012

DBus Root Cause, iPad Testing

In my last blog I mentioned we were having dbus problems. Right around 96-98 users it would stop allowing additional people into the server and dbus would begin chewing lots of CPU. At first I thought it was hitting a limit and denying additional requests. It turns out that it was hitting a 1024 open file limit, which is clearly seen by going into /proc/process_number/fd and watching the number grow. When it hits 1024 the server goes bad. The ever wonderful Vincent Untz found the git entry in dbus that fixed this issue. These patches had not been merged into OpenSuse 11.4. He is going to generate a new RPM for us. I'll test and we'll try and get it into the update channels. The OpenSuse bug is here.
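Watching that directory by hand gets old quickly, so here is a minimal sketch of the same check in Python, assuming a Linux /proc filesystem. The function name and the proc_root parameter are invented for the example, and reading another user's process normally requires root.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count the open file descriptors of a process via /proc/<pid>/fd."""
    # Each entry in the fd directory is one open descriptor.
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))
```

Running count_open_fds() against the dbus-daemon pid in a loop as users log in would show the number climbing toward 1024.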

It's been a disappointment that NX 4 hasn't been far enough along to deploy iPads using NX technology. We have been testing other ideas and only found them to be good enough for IT staff, and they are not ready for consumption by regular users. The various techniques under testing:

VNC: Fast over Wifi; doesn't work over EVDO from iPad; resume not working well; no rotary of ports
RDP: We tested RDP/iPad -- Windows/NX Client -- GNOME/Linux; the hop in the middle produces lag, applications like Xournal are too slow.
Citrix: We are going to test Citrix/iPad -- Windows/NX Client -- GNOME/Linux to see how it works. This is not yet working.
NX4/Safari: The alpha release allows you to connect to GNOME with the Safari browser. It's way too slow on a tablet for regular use. We're hopeful a native client will fix this.

Because of increasing requests for this technology to move into beta testing, we are going to move ahead with an approach that is more "client/server". We will continue working on host based solutions as we move into the new year. Document records retention is paramount in Government, and we need solutions that will comply with the law.

The tablet solution as it's shaping up now:

For email we have been looking over Groupwise 2012 and it's wonderful on the tablets. Using just Safari, one gets the UI below. Unlike the older Groupwise version, a single touch of the message list instantly opens the message in a preview pane.

The calendar is also designed well for the real estate of a tablet:

Dragon Naturally Speaking works well, and allows you to easily email the text generated by your spoken word:

So being that the NX upgrades are not yet ready, how do we allow people to transfer and read their documents on tablets? There are many middle-ware type applications for this function. But we always have to consider cost and, more importantly, staff size. At a certain point your infrastructure is so complicated that you almost get to the point that no single person can take a day of vacation. We found an application for the iPad that seems elegant in its simplicity and are going to begin testing. It's called FileApp. This software is well suited for integration into a GNOME desktop (and even mentions it!), and supports OpenDocument (OpenOffice/LibreOffice) files. It gives you a very simple interface that shows you all documents you have downloaded from your GNOME desktop, and a single tap of the UI starts an FTP server on the tablet. You are then able to gain access to it from the desktop and transfer files. Once the connection is closed, your document list refreshes and you can view them "offline".

From FileApp tap the WiFi symbol and the drop down appears. When this is displaying, FTP is running on the tablet. The UI gives the end users their FTP address which is based on their IP address at the time.

From the GNOME desktop, simply type in the FTP address into Nautilus and it displays all of the documents stored on the tablet:

Once the drop down window is removed, FTP closes, your document list refreshes and all available documents appear. You can view them by date or file type.

And further firming up the fact that OpenDocument files are not second class citizens, they open directly on the tablet and don't have to be converted to PDF before viewing.

This software is moving through our Infrastructure group for testing and then will move to the IT Director. Once that's done, we'll move this to a limited number of testers.

Wednesday, January 04, 2012

Calling DBUS Experts....

We're expanding our GNOME desktop project and dbus on OpenSuse 11.4 is kicking our butts.  If anyone with knowledge of this code has any tips, they are gladly accepted as a comment.

What's happening is that right around 96 to 99 users the system dbus seems to no longer accept GDM connections and more users cannot log in. Very often at this threshold we also see dbus chewing lots of CPU, as it seems to be receiving retries over and over again which all fail. /var/log/messages shows gdm crashing, and to my eyes it's happening when it's trying to talk to dbus. I have installed the debug symbols and should get better backtraces starting tomorrow. The shot below shows the crash.

The documentation is kind of lacking in regards to the tunable parameters.  One doesn't always know the default settings so as to know what should be increased.  It's also not very clear which resource is failing.  I set the following resources this morning and the issue happened again:

The technique that I used was to look at the source code and find where the defaults are set and then try and double them.  There are more parameters, any ideas which ones might help?  It would be wonderful if dbus-monitor showed you these types of failures, but it seems to only show you bus activity which really doesn't help.  As soon as we drop below the 96 users, everything works correctly and users can log on and off with no problems.   This leads me to strongly believe this is a parameter that is being reached. 

Any tips?  Drop a comment.  Thanks!