Wednesday, February 05, 2014

What's happened to Firefox?

Lots happening in Largo lately and I will get to a proper blog update in the next few days.  I have been working heavily on infrastructure changes to accommodate stateless GNOME sessions and BYOD devices.  Good progress, and interesting things to report.  Very busy hours of the day.

So it was an inopportune time to have to work a bug report against the current Firefox.  With all of these new versions, there seems to be a strong culture now of getting releases out the door regardless of the impact on long-time users.  It looks like a horse race with IE and Chrome to pack in as much as possible, and this comes at the expense of less understood features that are critical.

We have been using this technology forever, riding the Netscape wave and jumping over to Firefox around the 1.5 era.  Firefox is very fast and stable for us, even over remote X and thin clients.  Everything just works.  It takes me just minutes to do an upgrade and it's something that just churns for hundreds of concurrent users.  Our email is now web based and this is the backbone of the City.  Most new software and all cloud based solutions work with it...that's just awesomeness.

With a constant barrage of security exploits, it's critical that upgrades come in a timely manner.  And then came the problem:  Somewhere around Firefox 24 the whole download infrastructure was rewritten and now downloaded files no longer honor umask.  It was a disaster for us when this code was pushed live.  It was patched in Firefox 25 and now is not working again in both Firefox 26 and 27.  This is horrible for Linux and Mac users who need downloaded files to get the permissions their umask specifies; instead they all default to 644 regardless of umask.

Here is a comparison of the older version vs FF 27:

-rw-r--r-- 1 drichard drichard 30169644 2014-02-05 13:33 ffirefox-27.0.tar.bz2
-rw-rw-rw- 1 drichard drichard 30169644 2014-02-05 13:30 firefox-27.0.tar.bz2
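To make the difference in the listing concrete, here is a minimal Python sketch of the two behaviors.  It is not Firefox's code, and I am not claiming this is how the bug is implemented; the function names are made up for illustration.  The first function creates a file the conventional way (ask for 666 and let the process umask trim it), the second mimics the reported symptom by forcing 644 afterwards.

```python
import os
import stat
import tempfile


def create_honoring_umask(path, umask):
    """Create a file by requesting 0o666 and letting the umask trim the mode."""
    old = os.umask(umask)
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
        os.close(fd)
    finally:
        os.umask(old)  # always restore the caller's umask
    return stat.S_IMODE(os.stat(path).st_mode)


def create_forcing_644(path, umask):
    """Mimic the reported symptom: the file ends up 0o644 whatever the umask is."""
    old = os.umask(umask)
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
        os.close(fd)
        os.chmod(path, 0o644)  # explicit chmod is not affected by umask
    finally:
        os.umask(old)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(oct(create_honoring_umask(os.path.join(d, "honors"), 0o000)))
        print(oct(create_forcing_644(os.path.join(d, "forced"), 0o000)))
```

With a umask of 000 the first file comes out as 666 and the second as 644, which matches the two lines in the listing above.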

Maybe some of the developers have never seen Firefox running in the enterprise, or on multi-user servers, or in a VM or on a Mac with multiple users and don't realize the importance of this working correctly.  Please come and visit us anytime and we'd be happy to demonstrate these types of deployments!

So we are left now on a version too old with no solution in sight.  Do I need to start testing Chrome?

Tuesday, December 31, 2013

Quick Network Change Diagram

I made a quick diagram to illustrate to a few people the new design that we are testing whereby NX technology is used to deploy to our workstations.  I thought it might be a good visual tool to clarify my last blog.  Instead of using the Xserver and Pulse daemon on the thin clients, everything is passed through the backend computer server running NX/GNOME.  This allows for a stateless connection that can be resumed on any hardware that runs the NX client and any browser.

Lots of progress in testing this concept, and I'll blog about it next year.

Happy New Year.

Thursday, December 26, 2013

Thinning Thin Clients, And Other Projects

I have not published an update in a good while, but things have been busy.  Here are the things that I have worked on since the last blog:

Thinning Thin Clients

How the heck can you thin a thin client?  Well, it's my current project and things are progressing fairly well.  Since we started using thin clients in the mid-1990s, we have always used remote X as the transport.  It's elegant and fast for our needs, and consumes almost no bandwidth on a modern network.  Our current design has been wonderful for having roaming profiles.  You can log in anywhere in the City, and because everyone is running from a centralized host all of your software and files are immediately available for use.  If you go home and log in using NX, the same is true.

However, this has one design issue: you have to log out of the first location before starting another session.  If users forget to log out from the first location, they are able to "steal" the session, which severs the X connection abruptly and is not ideal when they damage settings or lose some of their work.  In general all of this is working very well and people move around the City all the time.  With the advent of tablets and more mobility, users want to "resume" sessions over different networks of various speeds, including WiFi.  In order to accomplish this goal, we're testing using NX technology for all sessions.  Using NX thins a thin client, because now the session is running 100% in the data center and the workstation is used only for mouse and keyboard; the Xserver is no longer remotely running on the thin client.

What this means for end users is that when a second server instance is started with their user account, they'll be able to "resume" the session from one location or device to another.  You can start typing a document at your desk, walk to a meeting and start up a tablet and resume the session and continue on the new footprint.

This change is presenting some scaling issues.  In the past some load was offloaded to the workstations, especially in the case of memory consumption.  Now all of this must be moved into the data center, which means bigger and faster servers with more cores.  We already have money in our budget to replace the GNOME server in 2014, so the timing could not be better.

We have two HP thin client models in use at the City, the t5725 and the t5745, both of which are discontinued.  We ordered the new t610 model which came with Ubuntu 10 installed.  I started the formal process in recent weeks of staging a build for production use.  Customizations were required in order to accommodate the older models.  The 5745 uses the intel driver and the 5725 is running ati.  So the build was modified to detect the thin client model at first boot and set up the xorg.conf appropriately.  I was making great progress, when HP released an upgrade for the t610 to Ubuntu 12.  So I created a tarball of the customizations and in a few hours, they were all working on the new operating system.
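The first-boot detection could look roughly like the sketch below.  This is not our actual build script.  Reading the model from /sys/class/dmi/id/product_name, the exact product name strings, the "Card0" identifier and the vesa fallback are all assumptions made for the example; only the two model-to-driver pairs come from the paragraph above.

```python
# Model-to-driver pairs from the post; everything else here is an assumption.
DRIVER_BY_MODEL = {
    "t5745": "intel",
    "t5725": "ati",
}


def read_product_name(path="/sys/class/dmi/id/product_name"):
    """Read the hardware model string (assumed location on a Linux client)."""
    with open(path) as fh:
        return fh.read().strip()


def pick_driver(product_name, default="vesa"):
    """Return the Xorg driver for a model string, or a generic fallback."""
    name = product_name.lower()
    for model, driver in DRIVER_BY_MODEL.items():
        if model in name:
            return driver
    return default


def device_section(driver):
    """Build the xorg.conf Device section for the chosen driver."""
    return (
        'Section "Device"\n'
        '    Identifier "Card0"\n'
        f'    Driver "{driver}"\n'
        'EndSection\n'
    )


if __name__ == "__main__":
    # Hypothetical model string; a real run would call read_product_name().
    print(device_section(pick_driver("HP t5745 Thin Client")))
```

Keeping the table in one place means adding a new model is a one-line change in the tarball of customizations.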

More and more users are doing Skype interviews and in the past they would just check out a laptop and use that from their desk.  Since Skype is available for Linux, we're testing the concept of adding that feature right on the thin clients.  So far it's working like a champ.

In a nutshell, all three models of the thin clients boot Ubuntu 12 very quickly, and start up a very basic FVWM desktop which offers them the ability to connect to our servers using the NX Client. I am in the process of tuning the build for the three models and I'm working with NoMachine on some issues to make the NX Client work better in our environment.

Here is a shot of the current alpha build.  FVWM provides menu and window management for software that connects to the servers.  Rdesktop, Skype and NX Clients run as siblings.  A full GNOME session is running inside the NX Client.


After a quick patch in Firefox 25 to allow files downloaded by users to honor umask, it did not land in Firefox 26 and will return in Firefox 27.  So we're going to skip a version, but the good news is that with Firefox that means only waiting a few weeks.


We jumped on LibreOffice 4.1 at 4.1.1 to solve some issues and improve file filters versus 4.0.  Out of 800 users, about 20 had to be rolled back for various bugs, which is normal and expected.  With release 4.1.4 we have been able to finally get everyone off of 4.0.  Things seem stable with this release too.  The server reports when users kill their software, and I'm not seeing many with LO.  Very cool.  Looking ahead to 4.2, there is a bad bug for us that prohibits us from testing heavily.  It's here.   We have lots of documents that make use of Nimbus Sans and the font currently is not rendering correctly. 

Tuesday, October 29, 2013

First Alfresco Upgrade, LibreOffice Work Around, Firefox 25

We had been testing Alfresco Community 4.2.c for a good while and getting positive feedback from our users.  Version 4.2.e came out and appears to be the version that will release as "4.2".  Naively, I thought there would be a ./install type upgrade with the latest code.  Alfresco upgrades are really more of a dump and reload, which increased the complexity of this greatly.  We decided to allocate some better hardware for this pilot project and for me to take the time to teach myself this process.  In a nutshell, you dump all of the data from Postgres and then it is restored on the new hardware.  You then bring over the alf_data directory and start up the daemons and hope it all works.  After a few test runs, I worked out all of the kinks and it came over and has been made available to end users.  
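For anyone facing the same dump and reload, the steps can be sketched as a list of commands.  This is a rough outline and not the exact procedure I ran: the database name, file paths and host name are hypothetical, it assumes Alfresco is stopped on the old server first, and it assumes the empty target database already exists on the new hardware.  The function only builds the commands so they can be reviewed before anything is executed.

```python
def upgrade_commands(db_name, dump_file, old_alf_data, new_host, new_alf_data):
    """Build the dump-and-reload steps as argument lists (nothing is executed)."""
    return [
        # 1. Dump the Postgres database in custom format on the old server.
        ["pg_dump", "-Fc", "-f", dump_file, db_name],
        # 2. Copy the dump to the new hardware.
        ["scp", dump_file, f"{new_host}:{dump_file}"],
        # 3. Restore into the (already created) database on the new server.
        ["ssh", new_host, "pg_restore", "-d", db_name, dump_file],
        # 4. Bring over the alf_data directory.
        ["rsync", "-a", f"{old_alf_data}/", f"{new_host}:{new_alf_data}/"],
        # 5. Start the daemons with whatever service script the install provides.
    ]


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    for cmd in upgrade_commands(
        "alfresco", "/tmp/alfresco.dump",
        "/opt/alfresco/alf_data", "newserver", "/opt/alfresco/alf_data",
    ):
        print(" ".join(cmd))
```

Doing a few test runs this way, as described above, is what shakes out the kinks before the real cutover.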

The LibreOffice 4.1 CMIS connector sadly no longer works with Alfresco.  I brought down LO 4.2 (Alpha) which has been resynced with the latest libcmis changes and it works like a champ, better than ever.  I have "upgraded" 5 of us to this release and we are going to begin testing saving and editing documents directly via LibreOffice -> Alfresco.  Response time and folder refreshes are much faster and the library changes are noticed and appreciated.  Let's see what feedback we get from the end users.

In the shot below, the new icons and art with Alfresco 4.2.e, and LibreOffice 4.2 making direct edits to files stored within the database.

Firefox 25 is out and fixes the issue of not honoring umask from the parent shell.  5 minutes after download, it was installed and live and we were fully patched.  800 users will be very happy about their file permissions.

Current Projects:  I'm looking over Ubuntu 10 loaded on the HP t610 thin clients and checking to see if it can be easily loaded onto 5745 and 5725 model thin clients.  This will continue with our design of having the same operating system on all workstations. I expect it to work, and easily.  Testing NX 4 client and server with the concept of using NX between all workstations and the server. 

Wednesday, October 16, 2013

What's Happening?

Yet again a good number of days have gone by without a blog posting.  Projects and upgrades have consumed most of my time in the last few weeks.

NX/NoMachine 4 Released & Tablets

NX 4 was released.  I had installed and looked over various beta builds as it was being developed, and I put the released version on a VM copy of our primary GNOME desktop server.  The install process is a piece of cake, and you just edit a few .cfg files to point to the right scripts to run at startup and away you go.

One notable feature is the ability to connect with a browser and get a full GNOME desktop and run all of your software without modification.  As seen below, you enter a URL and up comes the desktop.  Speed on Firefox 24/Linux is very acceptable.

This browser solution has not proven itself to work well on tablets, even over local WiFi.  But I was happy to see this on the download page:

Tablets have been interesting to me because one school of thought is that people would log in and get a GNOME desktop and work as they do from their desk.  Another thought is that they would work more client/server and download documents and use local software and then upload.  By far users have expressed more interest in the former.  They don't seem to want to use different/local software and want to use what works here at the City.  So the clients for iOS and Android will be greatly appreciated and used.

The biggest issue for us right now with NX 4 is that the clients are not yet ready for deployment in a multi-user environment with "regular users".  NX3 clients allowed you to automate the login process and once they entered their account/password...GNOME came right up.  NX 4 has a lot of prompts and questions which are better suited for power users.  As it stands right now, we would be bombarded with questions and support and there would be a drop in productivity.  If a user is attempting to connect, they have work that must be finished.  If they fail, that work doesn't get done.  There are plans on the roadmap from what I understand to add more of these enterprise features in the future, and we'll have to wait for them before we can deploy.

EDIT: Sarah from NoMachine posted information on this issue in the comments section.  We were unaware and are going to test.

Support Portal Has More Features

Our in-house support portal is still helping us monitor all aspects of a thin client and centralized environment, and some modifications have been made.  It's wonderful that when you have ideas, you can just throw in a few lines of code and it's done.

We're using the issue tracking module now for all calls received and the data is very interesting.  In the shot below you can see that in the last 24 hours our primary support call center got around 20 calls.  This is with many hundreds of users logging in and out of the workstations and using software.  Our current regular concurrent load is right around 300 users.  I have said it many times before: using thin clients allows your staff to work on maintenance and future projects vs spending lots of resources keeping support intensive PCs running.  I marked the general trends of problems:

(RED) Linux/GNOME/Desktop problems   Only 1 all day!
(GREEN) Phones
(PURPLE) Problems with hosted/cloud apps
(BLUE) Problems with MS Windows on various specialized computers
(BLACK) End user questions about working software or procedures.

I also hacked in a quick screen on the user detail area to show documents edited by the user.  Very often people call and have "lost" their document...or sometimes they make edits to a document in /tmp accidentally and then it's lost when this area is purged.  Now at a glance we can assist them without having to look on backup tapes and such.  Very helpful.

LibreOffice 4.1, Handling Rollbacks

This isn't unique to LibreOffice 4.1, because it's happened going all the way back to OpenOffice 1.1.  Whenever you get a patch, and even more so whenever you get a major release, there are always issues where documents don't work correctly.  I keep all of the old versions on the server and can always test and find the regression.  Standard procedure then is to roll back a user until we get a patch and then upgrade them again to the latest release.  I used to do this myself with hard-coded scripts.  I put this into the support portal and now anyone in IT can either roll their own account back or roll back another user to solve these types of problems.  All of this activity is logged, and we're nagged once a day about people on prior versions so that we continue to try and resolve their issue.
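The bookkeeping behind that workflow is simple enough to sketch.  The class below is a hypothetical illustration and not the portal's code: the class name, method names and in-memory storage are all assumptions.  It only shows the three pieces described above, which are pinning a user to an older release, logging who did it, and producing the daily nag list.

```python
import datetime


class RollbackRegistry:
    """Tracks users pinned to an older release (hypothetical design)."""

    def __init__(self, current_version):
        self.current_version = current_version
        self.pinned = {}     # user -> older version they were rolled back to
        self.audit_log = []  # (timestamp, staff, user, action, version)

    def _record(self, staff, user, action, version):
        self.audit_log.append(
            (datetime.datetime.now(), staff, user, action, version)
        )

    def roll_back(self, staff, user, version):
        """Pin a user to an older version and log who did it."""
        self.pinned[user] = version
        self._record(staff, user, "rollback", version)

    def upgrade(self, staff, user):
        """Return a user to the current release once a patch lands."""
        self.pinned.pop(user, None)
        self._record(staff, user, "upgrade", self.current_version)

    def version_for(self, user):
        """Version the launcher should start for this user."""
        return self.pinned.get(user, self.current_version)

    def daily_nag(self):
        """One reminder line per user still on a prior version."""
        return [
            f"{user} is still on {version}"
            for user, version in sorted(self.pinned.items())
        ]
```

A launcher script would call version_for() at startup, and a daily cron job would mail out whatever daily_nag() returns.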

We have three bad issues with LO 4.1 right now, affecting about a dozen people.  1) There is a bug with building graphs from spreadsheet cells 2) There is a bug for selecting sections of spreadsheets for printing 3) There is a bug with drop down cells in Calc.

These users are on 4.0 and everyone is happy and working.

Firefox 24 Bug And Quick Response

Firefox 23 had an issue with saving files: the files were saved as 600 on Linux and no longer honored umask.  What then happened was that work flow for many people was interrupted because as they shared files with others, they could not be opened.  In general users do not understand how to use file permissions nor how to change them.  The Firefox developers came through in a pinch and a patch is already landing in time for version 25.  Thanks guys.

Alfresco Testing Continues.

We have three non-IT people starting to test Alfresco 4.2 as a proof of concept.  Deployment is not imminent at this point because we don't have the staffing, but it's gaining some steam and people are starting to see the power of working in this manner.  The testers are putting PDF content into the Alfresco database and then seeing that it's immediately available on their tablets.  We're meeting today again to discuss how to connect with LibreOffice, and will talk about the concept of using hybrid PDF as our official City file format.  Right now LO is a bit clunky with this format, but we see possibilities, and I can start creating feature requests and possibly funding improvements.

Alfresco supports RSS feeds, so I have been building a proof of concept systray application which would alert users of changes to documents and work flow that need their attention.  In the shot below, the icon is sitting in the tray and clicking on the icon shows the last document updates.  At our meeting today we'll talk about this idea and whether we want to test notification popups too.  The strength of open source:  being able to rapidly build prototypes and then deploy solutions quickly with little to no cost.
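The feed-reading half of such a systray application is only a few lines.  The sketch below is not the prototype itself; it assumes the feed follows the usual RSS 2.0 layout (item elements with title, link and pubDate), and the sample feed text and URL are invented for the example.  Fetching the feed and drawing the tray icon are left out.

```python
import xml.etree.ElementTree as ET

# Invented sample in RSS 2.0 layout; a real feed would come from the server.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Site activity</title>
    <item>
      <title>Budget.pdf updated</title>
      <link>http://alfresco.example/doc/1</link>
      <pubDate>Wed, 16 Oct 2013 09:15:00 GMT</pubDate>
    </item>
    <item>
      <title>Minutes.odt added</title>
      <link>http://alfresco.example/doc/2</link>
      <pubDate>Wed, 16 Oct 2013 08:40:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def latest_items(rss_text, limit=5):
    """Return up to `limit` feed entries as dicts, in feed order."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "pubDate": item.findtext("pubDate", default=""),
        })
    return items[:limit]


if __name__ == "__main__":
    for entry in latest_items(SAMPLE_FEED):
        print(entry["pubDate"], entry["title"])
```

The tray menu would then just show the titles and open the link when one is clicked.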

SuseCon Orlando

It's only a few weeks away, and I'll be attending along with my coworker.  Find us and say Hi!

Thursday, August 01, 2013

HP t610 Thin Client R&D Project

I haven't talked about desktop hardware for a good while.  Years ago we bought several hundred HP t5725 workstations.  They're still in production and working great, and are projected to run their full 10 year duty cycle.  HP replaced the t5725 with the t5745 model which offered better performance and 1Gb networking.  We have purchased around 50 of the t5745s and given them to users that need faster performance, such as those that use Google Earth and large PDF files.  They too should provide a 10 year duty cycle.

In recent months the t5745 was discontinued and replaced with the t610.  Specs are here.  In short, it's an AMD Dual-Core T56N APU with Radeon HD 6320 Graphics 1.65 GHz running the Ubuntu operating system.  We purchased one device for around $400.  My first inclination of course was to open it up and look under the hood.  Pictures and comments are below.

In the coming weeks, I'm going to install our custom modifications into the base OS and get it working the same as the earlier models.

I've spoken about thin clients many times, but wanted to mention again how incredibly cost effective and stable they are to run for enterprise use.  A small City of 10 employees would not reap benefits, but as you get into the hundreds of users, the savings are clear.  We have about 560 of them deployed around various City buildings.  $400 purchase price with a 10 year duty cycle yields $40 per year (hardware only) desktop costs.  When one of them dies, the users have nothing saved locally and our support group just walks down and replaces the hardware.  The user is back up again in just a few minutes with no loss of productivity.
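The arithmetic is easy to rerun with your own numbers.  Here is a small sketch; note that applying the $400 price to all 560 deployed units is an assumption made for illustration, since the older models were bought at different times and prices.

```python
def annual_hardware_cost(purchase_price, duty_cycle_years, units=1):
    """Hardware-only cost per year, spread evenly over the duty cycle."""
    return purchase_price * units / duty_cycle_years


if __name__ == "__main__":
    # One desktop: $400 over 10 years.
    print(annual_hardware_cost(400, 10))
    # Whole fleet, assuming every unit cost $400 (an assumption, see above).
    print(annual_hardware_cost(400, 10, units=560))
```

That works out to $40 per desktop per year, and $22,400 per year for a fleet of 560 under the same-price assumption.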

Here is the inside of the t610.  The heat from the CPU is moved away with copper and vents.  No moving parts, no fans.  The blue module in the upper left hand corner is the flash "hard drive".  These can be 1GB or 2GB and are cheap and easily upgraded.

Lots of ports on the front, nice design.

More ports and some legacy connections on the back.  All of our hardware should connect and work.

The entire case has no screws and pops apart with just clips. Here it is back together.

Tuesday, July 30, 2013

Lots Of Updates

I just noticed again that it's been a while since my last blog.  Things have been busy as always, and some of my projects lately were not so conducive to screenshots and technical thoughts.  But a few things might be of interest.

SuseCon 2013

First up I got permission and funding to attend SuseCon in Orlando this fall.  It's always nice to meet people and have hallway conversations.  Find me and say Hi if you'd like to converse.  The conference last year was very productive!

Alfresco Testing & Analysis

I have been working with the IT Director on testing and exploring ways to deploy Alfresco, and content management in general.  Historically we have always had silos of data that were not highly shared, and it's our hope that we can move in another direction with this software.  We have been exploring ideas on the best way to build the top level Sites and then how content should be stored.  If several departments are working on a City project, our information should be shared without duplication of effort and documents.  We're making some headway on ideas and believe that we have built the best possible top level structure.  We are also considering the staffing requirements of this type of deployment and impact to other City departments.  Lots of testing and R&D ahead on this project.  The software itself and mobile devices are working great.

Support Portal, Additional Logging

We have not been able to add additional staff for a while, so I have been trying to get more and more data that is logged to flat files on our servers into the hands of our support staff.  This allows all of IT to see issues that normally would have meant checking the various files manually and understanding how to scan through logs.  The screens are rough, but working and are allowing us to be very proactive on seeing problems.  The blue area shows users logging in and out of the servers.  The purple area is showing us users that have logged in the most often in the last few days.  Usually they are not having problems, but sometimes this indicates someone that is struggling with errant hardware or network connections.  Remember that very often end users will never call for help and will continue to struggle.  Seeing this information allows us to call them and offer help.  The dark green section is showing users that logged in multiple times without first logging off.  This sometimes means they have simply powered off their workstations, or are losing power and have a bad UPS.  The red section shows users logging into NX from the Internet.  The light green section shows users that had authentication failures with Zimbra.  The black section will show users that entered the wrong passwords into the server and those users that entered the wrong passwords into their screensavers.
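As an example of how little code one of these panels needs, here is a sketch of the "logged in multiple times without logging off" check.  It is not the portal's code.  The event format, a chronological list of (user, action) pairs already pulled out of the flat files, is an assumption, as are the user names.

```python
from collections import defaultdict


def logins_without_logout(events):
    """Count, per user, logins that arrived while a session was still open.

    `events` is a chronological iterable of (user, action) pairs where
    action is "login" or "logout" (an assumed, pre-parsed log format).
    """
    logged_in = set()
    repeats = defaultdict(int)
    for user, action in events:
        if action == "login":
            if user in logged_in:
                repeats[user] += 1  # previous session never logged off
            logged_in.add(user)
        elif action == "logout":
            logged_in.discard(user)
    return dict(repeats)


if __name__ == "__main__":
    sample = [
        ("amy", "login"), ("amy", "login"),
        ("bob", "login"), ("bob", "logout"), ("bob", "login"),
        ("amy", "login"),
    ]
    print(logins_without_logout(sample))
```

In the sample, amy shows up with two repeat logins and bob does not appear at all, so amy is the one to call about a possible power or UPS problem.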

Support Portal, Creating Issues

The portal is aware of a lot of issues and sees problems on the network.  So it was a very simple step to just create a basic issue tracking module.  In the various tabs of monitoring, there is a button marked [ Create Issue ] that takes the highlighted item and automatically generates an issue and allows us to add notes.  These screens are so simple to create with Glade and Python that even as it stands now it's increasing our ability to assist users.  This development is not a full-time project, so I'm sure the UI will not be scrutinized.  :)

The shot below shows issues that were generated automatically from the various log files.  The portal knows who, where and what; we just needed a way to track status and add notes.  The red section shows a summary of the open issues.  The black section is a very simple plugin system that allows for summary of the data.

And the issue detail screen allows for very basic tracking and note taking.

Looking For Monitor Problems

We have been seeing a few issues a day where users are having problems with monitor resolutions and some issues where software seems to put the Xserver out of sync with the monitor.  The first step was to write a little ksh script that fires at login, and basically compares their Xorg file against what xrandr -q reports.  In many cases X just kind of fixes itself and works, but in some cases the users will have problems.  So we're slowly cleaning up these issues and getting everything configured correctly.  Each little 1% improvement that you get counts and reduces user frustration.  In the issue detail above, the portal detected that a user had dual monitors that were not configured in the right way; one supported 1920x1200/1920x1080 and the other only supported 1920x1080.  Without this centralized software and a server based solution, this would be very time consuming to troubleshoot.
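The real check is a ksh script, but the comparison can be sketched in Python.  This is an illustration and not a port of that script: the sample xrandr -q output is made up to match the dual monitor case described above, and the configured mode is passed in directly instead of being parsed out of the Xorg file.

```python
import re

# Mode lines in `xrandr -q` output are indented and start with WIDTHxHEIGHT.
MODE_RE = re.compile(r"^\s+(\d+x\d+)\s")

# Invented sample output for two monitors with different capabilities.
SAMPLE_XRANDR = """\
Screen 0: minimum 320 x 200, current 3840 x 1200, maximum 8192 x 8192
DVI-0 connected 1920x1200+0+0 (normal left inverted right x axis y axis) 518mm x 324mm
   1920x1200     59.95*+
   1920x1080     60.00
DVI-1 connected 1920x1080+1920+0 (normal left inverted right x axis y axis) 510mm x 287mm
   1920x1080     60.00*+
VGA-0 disconnected (normal left inverted right x axis y axis)
"""


def parse_xrandr(text):
    """Map each connected output name to the list of modes it reports."""
    outputs = {}
    current = None
    for line in text.splitlines():
        if not line.startswith(" "):
            parts = line.split()
            if len(parts) > 1 and parts[1] == "connected":
                current = parts[0]
                outputs[current] = []
            else:
                current = None  # Screen line or a disconnected output
        elif current is not None:
            match = MODE_RE.match(line)
            if match:
                outputs[current].append(match.group(1))
    return outputs


def check_configured_mode(configured_mode, xrandr_text):
    """List the connected outputs that do not support the configured mode."""
    return [
        f"{output} does not list {configured_mode}"
        for output, modes in parse_xrandr(xrandr_text).items()
        if configured_mode not in modes
    ]


if __name__ == "__main__":
    for problem in check_configured_mode("1920x1200", SAMPLE_XRANDR):
        print(problem)
```

With the sample above, a configured mode of 1920x1200 is flagged for DVI-1 only, while 1920x1080 passes on both monitors.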

Other projects:  Testing NX 4 preview, prepping for Zimbra patches and upgrades, testing LibreOffice 4.1 and prepping for deployment, continued patches and testing for Firefox, Flash and Java.