Wednesday, May 30, 2012

Happy 5th, Firefox And USB Progress

We were working on future budgets and were reminded again of the great savings that come with running thin clients. Our thin clients arrived just about 5 years ago and they are still working great. Since the 3 year warranty expired we have been losing only about 5 a year to hardware failure. Right now it appears we can get another 5 years of duty out of them. That works out to $400 over 10 years, or around $40 a year in physical desktop hardware cost, and they are still very rarely touched by IT staff. This architecture really does work and really does save lots of money.

We had a very odd situation where HTML 5 videos were only working for my user account and root. I knew at that point it had to be some kind of file permissions issue. After some poking around with strace, I found it in /tmp:

drwx------    2 drichard drichard       4096 May 29 15:56 mozilla-media-cache

Out of the box, Firefox does not support multiple users sharing this cache. The 'fix' is to add this to the script:

# Give each user a private temp folder; -p keeps repeat logins quiet.
mkdir -p -m 700 "/tmp/$USER.mozilla"
export TMPDIR="/tmp/$USER.mozilla"

and this gives everyone their own private temporary folder and everything works now as expected.
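For anyone who launches Firefox from Python instead of shell, the same idea can be sketched roughly as follows. This is an illustration only: the function names private_tmpdir and firefox_env are made up for this example, and the demo uses a scratch directory as a stand-in for /tmp.

```python
import os
import tempfile


def private_tmpdir(user, base="/tmp"):
    """Create base/<user>.mozilla if it is missing and return its path."""
    path = os.path.join(base, f"{user}.mozilla")
    # exist_ok plays the role of "2> /dev/null": a second login is not an error.
    os.makedirs(path, mode=0o700, exist_ok=True)
    # makedirs honours the umask and leaves existing folders alone,
    # so set the private mode explicitly.
    os.chmod(path, 0o700)
    return path


def firefox_env(user, base="/tmp"):
    """Return a copy of the environment with TMPDIR set to the private folder."""
    env = dict(os.environ)
    env["TMPDIR"] = private_tmpdir(user, base)
    return env


if __name__ == "__main__":
    demo_base = tempfile.mkdtemp()  # stand-in for /tmp so the demo touches nothing real
    print(firefox_env("drichard", demo_base)["TMPDIR"])
```

The returned dictionary could be passed as the env argument of subprocess.Popen when starting the browser.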

I'm still working on the new thin client upgrade, and my area of focus right now is making the USB experience better for our users.  We have a situation where some users have full access to the USB sticks, but for many others we just want to give them upload capability only for pictures they have taken with cameras; there is no reason for them to be moving files off the server.   If I open the photo manager automatically, that forces people with full USB access to have another unwanted UI.  So now when users insert a USB device they are given a dialog which describes the two paths they can take and lets them pick (below).

Remember too that the USB stick and its files are local to the thin client. The GNOME desktop is not aware of the insert, so there are some nuances in that regard. As the user performs the various functions, the files are moved to the server via the clipboard (copy/paste) or via FTP file transfer and Nautilus.

So the next area for improvement is when to do a sync to the stick. Users of the photo manager and clipboard won't need it, but those using drop and drag will. So I wrote a quick little tray/notification applet that does a sync and flushes the files when activated. When they pick "Full File Manager" they are given a dialog which indicates that the applet is going into the panel.

Nautilus does a great job of hiding the complexities of FTP, and generates nice thumbnails. When they are finished they right-click on the applet and select "Save All Files And Prepare For Eject" and a sync is done.
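The "sync and flush" step can be sketched in a few lines of Python. This is a guess at the general shape, not the applet's actual code: the function name is borrowed from the menu label, and the open_files argument is an assumption for illustration. os.fsync and os.sync are standard library calls; os.sync is only available on Unix.

```python
import os
import tempfile


def save_all_and_prepare_for_eject(open_files=()):
    """Flush each open file to the device, then sync everything else."""
    for handle in open_files:
        handle.flush()             # push Python's own buffer to the OS
        os.fsync(handle.fileno())  # push the OS cache for this file to the device
    os.sync()                      # flush all remaining dirty buffers (Unix only)
    return True


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".jpg") as photo:
        photo.write("pretend this is a picture")
        print(save_all_and_prepare_for_eject([photo]))
```

After the call returns, the data has been handed to the device, which is the point at which it is reasonable to tell the user the stick can be removed.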

One of the beautiful things about X/GTK and Linux is the transparency of where the software is running. GNOME and avant-window-navigator are on the server in our computer room. The USB notification applet is running on the thin client itself and it properly detects that it should sit in the panel. If one of our users is performing this USB interaction while running Evolution, LibreOffice and Firefox, they could very well be running software from 3 or 4 different servers along with software running on their workstation, and it's all completely integrated. Next up will be improvements to the simple photo manager and clipboard application for those users that do not have full USB access rights. This builds on the feedback of those that used the last iteration.

Other Projects: Trying to get some NX issues worked out, installing patches and testing.  LibreOffice 3.5.4 is out, will install that tonight from home and it contains 2 patches that affect us -- very nice to see how fast this software is advancing.

Tuesday, May 22, 2012

Support Portal Updates, Pictures Options And Other Updates

It's already been a busy week, but productive. I have been hacking on the support portal to add more "actionable" events and cleaning up the UI. There are still some spacing issues related to names that will be addressed. But the support group can now see when users are clicking on icons for software they do not have a license for. After the user clicks on it three times in a row we are confident it wasn't accidental and now get a tile stating this fact. The portal now also displays when users have changed some settings in Evolution that we do not want changed (for example, checking email at intervals of less than 10 minutes). We also can now see when a user's monitor goes dark or blinks and they power off the thin client and log back in again. People seem surprised/pleased that we can see these things and that they are contacted almost immediately.
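The "three times in a row" rule is easy to picture in code. The sketch below is an illustration under assumptions, not the portal's implementation: the class name, method name and message wording are invented, and only the threshold of three comes from the description above.

```python
from collections import defaultdict

CLICK_THRESHOLD = 3  # "three times in a row"


class UnlicensedClickTracker:
    """Count consecutive clicks by a user on the same unlicensed icon."""

    def __init__(self, threshold=CLICK_THRESHOLD):
        self.threshold = threshold
        self._last_icon = {}
        self._streak = defaultdict(int)

    def record_click(self, user, icon, licensed):
        """Return a tile message when the streak hits the threshold, else None."""
        if licensed:
            # Any licensed click breaks the streak.
            self._last_icon.pop(user, None)
            self._streak[user] = 0
            return None
        if self._last_icon.get(user) == icon:
            self._streak[user] += 1
        else:
            self._last_icon[user] = icon
            self._streak[user] = 1
        # Fire exactly once, on the click that reaches the threshold.
        if self._streak[user] == self.threshold:
            return f"{user} clicked unlicensed icon '{icon}' {self.threshold} times in a row"
        return None


if __name__ == "__main__":
    tracker = UnlicensedClickTracker()
    for _ in range(3):
        print(tracker.record_click("ann", "gimp", licensed=False))
```

Firing only on the exact threshold keeps a frustrated user who clicks ten times from filling the summary with duplicate tiles.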

I was doing some UI changes to the picture options UI that appears when you double-click on a photo and that now is live.  Centralized scripts are just a thing of beauty.  I always make the changes in another .py file, test with a few people hard coded and then comment us out and it immediately goes live.  Here is the script that launches the new UI and how the old one was turned off.  Right now I'm sending debugging information to /tmp and after a few days that will be turned off and the commented code removed.  Easy to roll back, easy to put into production, nothing destructive.
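The rollout pattern can be sketched roughly like this. It is an illustration, not the actual launch script: the file names and the new_ui_live switch are hypothetical, and the tester list is just an example account.

```python
BETA_TESTERS = {"drichard"}  # example account, hard coded while testing


def pick_picture_ui(user, testers=BETA_TESTERS, new_ui_live=False):
    """Return the script to launch when this user double-clicks a photo."""
    if new_ui_live or user in testers:
        return "picture_options_new.py"  # hypothetical file name
    return "picture_options.py"          # hypothetical file name


if __name__ == "__main__":
    print(pick_picture_ui("drichard"))                 # tester gets the new UI
    print(pick_picture_ui("someone_else"))             # everyone else stays on the old one
    print(pick_picture_ui("someone_else", new_ui_live=True))
```

Because every user runs the same central script, flipping the one switch (or removing the tester check) changes the behaviour for everyone at their next double-click, and flipping it back is the rollback.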

Here is the finished UI, similar to what I posted last week with a few small changes:

The reason for the Crop buttons becomes clear when you see the shot below.  Users on dual screens get this when they hit PrintScreen and the aspect ratio is terrible for printing on paper...and very often they want to print what is on 1/2 of the screen:

So they just hit the Crop button and the screen chops in half and is ready to save to their Desktop, print or place into the clipboard:
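The arithmetic behind the Crop buttons is simple. Here is a small sketch; the function name is invented for this example, and the post does not say which imaging library does the actual cropping. The tuple order (left, upper, right, lower) is the one Pillow's Image.crop expects, chosen here only for convenience.

```python
def half_screen_box(width, height, side):
    """Return (left, upper, right, lower) for one half of a dual-screen shot."""
    middle = width // 2
    if side == "left":
        return (0, 0, middle, height)
    if side == "right":
        return (middle, 0, width, height)
    raise ValueError("side must be 'left' or 'right'")


if __name__ == "__main__":
    # Two 1920x1080 monitors side by side give a 3840x1080 screenshot.
    print(half_screen_box(3840, 1080, "left"))
    print(half_screen_box(3840, 1080, "right"))
```

Each half comes out at the aspect ratio of a single monitor, which fits a printed page far better than the full double-width image.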

Another reason for the changes made to the UI is that I wanted to update the UI that appears when users insert USB sticks into their thin clients.  I'll post a more detailed blog about this issue when I am finished.  But in a nutshell all users have access to a simple photo manager that allows them to quickly remove a few photos from a digital camera or USB stick.  Those users with higher access levels can use a full file manager as you would expect from such devices.  I have started to modify this simple UI that trips when the thin clients detect USB insert.  I have built a basic Glade screen fashioned after the one on the main GNOME desktop server; it will mature and look much nicer when finished.  The design goal is to keep those huge megapixel images off the City network that are just bound for LibreOffice and Evolution.  Their higher quality is not needed. As I said, more detailed information will come in a few days.

Other projects: Still working on the new thin client build and making progress; updating LibreOffice to 3.5.3 tonight; very pleased with our drop in Evolution crashes on SLED 11 with the most recent patches--now it's time to figure out some of these deadlocks.

Thursday, May 17, 2012

Project Updates & Why Customize?

The topic of desktop customizations has always come up here at Largo. I do spend time making changes to how the desktop works, but it's not a significant part of my day. Most of my job entails debugging software and problems, and helping architect future technology. Over the years my viewpoint has become that when people say they want "Microsoft Windows", what they really mean for the most part is that they want a certain set of steps they have memorized on that operating system. When people have their own computers they spend countless hours tinkering with them and then countless dollars buying software to move documents from one file format to another and cobble together something that works. We in the computer business have done a horrible job with software design in that regard. The things that are basic to us (file system layouts, file formats and file sizes) are exactly the things that users struggle with the most. I've mentioned before that when I was younger I always thought things would get better when the "computer generation" came into the workplace, but that hasn't happened at all. I cannot overstate how much time users spend on "files". In our case we also have users who log on for a few minutes a day, have never used computers in the past and have no desire to become proficient at them. If the desktop was not modified, some issues would come to the surface: 1) Many, many man hours spent looking for files, retyping 'lost' documents, etc. 2) User customizations and settings that affect the desktop and cause it to fail. 3) An increasing desire to buy more and more software to 'fix' things that are happening because of skill. 4) Failure because it's too hard.

The customizations therefore yield lower support calls and increased efficiencies by eliminating issues of file location, file type and file sizes from their work.  The design tries to create an environment where files in the right format and size are moved automatically between applications.

I posted a Glade screen a few days ago with the revamped Picture UI that comes up when you double-click on a photo or image.  It's moved past the vaporware stage and is now being tested by about 5 of us.  Still some issues to work out, but it's showing promise.  Python makes this all a breeze, but for end users these things are just so confusing.  It's not appropriate to try and get people into GIMP for these basic functions required to interact with software packages; they'll never get it -- and it's just silly to expect it.

The screen appears and it clearly shows the file size and number of pixels, which for most people means nothing. However the new feature is to estimate "suitability" for use in Evolution and LibreOffice, which are the two primary areas that will receive images. In the shot below the picture has been opened and it's way too big for document construction and for inserting in an email.
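A suitability estimate of this kind could be as simple as comparing the pixel count against two cut-offs. The sketch below is only an illustration: the thresholds are assumptions picked for the example, not the values used in our UI, and the function name is made up.

```python
# Assumed cut-offs for illustration only.
TOO_SMALL_PIXELS = 320 * 240
TOO_BIG_PIXELS = 1600 * 1200


def suitability(width, height,
                too_small=TOO_SMALL_PIXELS, too_big=TOO_BIG_PIXELS):
    """Turn raw pixel dimensions into a label a user can act on."""
    pixels = width * height
    if pixels < too_small:
        return "Too Small"
    if pixels > too_big:
        return "Too Big"
    return "Good"


if __name__ == "__main__":
    print(suitability(3648, 2736))  # a typical 10 megapixel camera shot
    print(suitability(1024, 768))
    print(suitability(160, 120))
```

The point of the label is that the user never has to interpret a number like 9,980,928 pixels.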

The ResizeTo option is set to "Medium" and the image reduces and the file size drops substantially. From here it can be emailed, placed into the clipboard or printed. I disabled GIMP and EOG once the image was reduced, to keep users from opening the temporary buffer, making changes and then losing them because they didn't save to the right folder location. The details of the enabling and disabling of buttons are still working through my head.
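The resize step boils down to scaling both edges by the same factor so the longest edge fits a preset. A minimal sketch, with assumptions stated: only "Medium" appears in the UI description above, so the other preset names and all of the pixel values here are invented for the example.

```python
# Longest edge in pixels for each preset; values are assumptions.
RESIZE_PRESETS = {"Small": 640, "Medium": 1024, "Large": 1600}


def resized_dimensions(width, height, preset="Medium"):
    """Scale (width, height) down so the longest edge fits the preset."""
    longest = RESIZE_PRESETS[preset]
    # Never scale up: small images are returned unchanged.
    scale = min(1.0, longest / max(width, height))
    return (max(1, round(width * scale)), max(1, round(height * scale)))


if __name__ == "__main__":
    print(resized_dimensions(3648, 2736, "Medium"))
    print(resized_dimensions(800, 600, "Medium"))
```

Keeping a single scale factor preserves the aspect ratio, so the picture shrinks without being stretched.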

There is a gotcha that I found with placing a picture into the clipboard; it seems like the parent application needs to stay open or the buffer is lost if the image is over a small size.  I tried to it but that doesn't seem to work either.  So for now when they click on [ To Clipboard ] they get a green checkmark indicating that it finished and then an intrusive dialog with instructions on how to continue.  A message in the status line on the bottom would never be seen, so I felt this was the best technique in this case.

I expect these changes to be fine tuned and then moved out to users next week. I'm hopeful this will increase the usability and success of interacting with pictures. With just a few lines of Python, we should see benefit quickly.

Other Projects Updates From This Week:  

LibreOffice is running like a champ and is very stable. We have hardly gotten any support calls and it seems to have slipped right in and taken over from OpenOffice. I do wish that LibreOffice was hooked into bug-buddy so that I could see how often people are crashing. We aren't getting calls about crashes, but I like to see them happen and have backtraces.

Novell came through for us and created a big GTK patch for some libraries that were not thread safe interacting with Evolution.  It was a merge of some upstream patches and we loaded them Wednesday.  Previously we would get about 3-5 crashes each day on just this one bug and so far I haven't received one.  As I have mentioned in the past, all backtraces and deadlocks come to me automatically. Very happy to see this improving.

SuseCon will be in Orlando this fall and some of us from Largo will attend. Federico has built a page with an early concept of a site visit to our City so that people can see technology in place.  The link is here.  If you have never seen centralized servers and how software works over remote display it's pretty cool stuff.  It's always nice to show the issues we deal with because I feel they represent issues seen in the enterprise.

Friday, May 11, 2012

Support Portal, Thin Client Updates & UI Updates

Our "support portal" and any thin client updates go hand in hand. We control features centrally and then they are pushed to thin clients. Now that the GNOME desktop is deployed and LibreOffice is live, my biggest current project is merging in all of the features we wanted to deploy in our next thin client update. I have finished the code to allow Kiosk/POS type connections to Microsoft Windows without requiring a connection first to the City system. For those at our Recreation sites that move around between workstations, this will be a big help. I also have added a Help button on the thin clients that allows users to send us email without first logging in, and they can check on the status of the servers. We placed our reboot schedules in ICS files for Evolution users, but now the thin clients will be able to display this information as well. I'll blog more about this with shots when it's all finished.

One feature request that's coming in this next release is the ability to support dual monitors beyond just [ LANDSCAPE ] [ LANDSCAPE ]. We'll be adding [ PORTRAIT ] [ PORTRAIT ] and [ PORTRAIT ] [ LANDSCAPE ]. We are all on the same video cards, so once I get it working it deploys to everyone else with the same Xorg files, which is nice. The support portal is being modified to understand which combinations work and to set the appropriate configuration files for them to download at the next OS update. I spent some time hacking on the thin client detail screen, which now better understands the monitors, their type and exactly what is displaying on each half. When requested, the portal breaks their current screen into two pieces and then displays each piece in the UI on the appropriate monitor. Mousing over a monitor displays all supported resolutions, and the new OS build queries their devices and returns the make and model of the monitor. This is going to save us lots of time.
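To show what "breaking the screen into two pieces" means for the three layouts, here is a small sketch. It rests on assumptions that are not from the portal code: both monitors are taken to be the same panel, the 1280x1024 default is just an example resolution, and the function name is invented.

```python
SUPPORTED_LAYOUTS = {
    ("LANDSCAPE", "LANDSCAPE"),
    ("PORTRAIT", "PORTRAIT"),
    ("PORTRAIT", "LANDSCAPE"),
}


def monitor_regions(layout, panel=(1280, 1024)):
    """Return one (x, y, width, height) rectangle per monitor, left to right."""
    if tuple(layout) not in SUPPORTED_LAYOUTS:
        raise ValueError(f"unsupported layout: {layout}")
    long_edge, short_edge = max(panel), min(panel)
    regions = []
    x = 0
    for orientation in layout:
        if orientation == "LANDSCAPE":
            width, height = long_edge, short_edge
        else:  # PORTRAIT: the panel is rotated, so the edges swap
            width, height = short_edge, long_edge
        regions.append((x, 0, width, height))
        x += width  # the next monitor starts where this one ends
    return regions


if __name__ == "__main__":
    print(monitor_regions(("PORTRAIT", "LANDSCAPE")))
```

Each rectangle is the slice of the combined screenshot that belongs on one monitor in the detail screen.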

I cleaned up the [ Summary ] tab which displays actionable tiles of user problems. These are not warnings, but issues where it's mandatory for action to be taken. I have cleaned up exactly what is displayed and done some alignment of the widgets. In the shot below it's alerting our support staff that some printers have stuck queues, that some users had software problems because of missing network permissions and that some users had the power drop on their workstations. Hovering the mouse over these tiles displays the problem along with the most likely way to resolve the issue.

I monitor support calls and have hallway conversations with users all the time. The intermediate UI that comes up when you double-click on a picture from Nautilus had some issues. It's always interesting to me to see how users react and use software, and the things that I didn't think of during the design stage. Very often I have better ideas, but am always trying to work these issues within the hours of the day. These changes require very few coding changes; it's mostly about the presentation and UI. The shot below shows the old UI (left) and the new one (right). The new UI is not yet live; right now it is just a shell in Glade until I feel all the changes look good, and then I'll move the code over, which will be easy. If you are interested in user interaction, here are the changes:
  • Users seem to like having art on each button, they seem to remember steps by artwork and not the words; each button now has art.
  • The users were not finding the button to PrintToSelectedPrinter easily, so the button was moved to the right of the selected printer.  No call is made to the GNOME printer UI; it has way too many options for this purpose.
  • The users were not seeing easily that you could DeliverWithEvolutionBypass (SMTP dump to Groupwise) based on the entry email box below.  This should be more clear.
  • The 'Size of This File' area, while a good start, did not really tell them anything they could understand and use.  The new UI will alert them of "suitability" with Evolution and LibreOffice.  Everyone is shooting 10 megapixel now, which is not needed for email and document construction.  I'll make generalized statements (Good, Too Big, Too Small, etc) about the pictures.
  • The functions that allow you to alter a photo are now under the photo; the functions to do something with the resulting output are all on the right side.
  • Users with dual screens were taking screenshots and then wanting to only print 1/2 of the screen.  The current code would print the whole width landscape which is too small.  Buttons are now available to crop the screen.  Going into GIMP and doing this by hand is too many steps for such a basic function.
  • Various alignments and layout techniques improved because I'm slowly learning Glade as time allows.

Once the code is done, I'll connect it for my user account only for testing and then release it to beta testers for wider use and then deploy citywide.  These changes are super easy on a centralized server and are literally just commenting in and out a few lines.   I'll be interested to obtain their feedback.

Thursday, May 03, 2012

LibreOffice Data And Notes, Let The Computer Do The Tedious

We have been live on LibreOffice now for a few days and things are going well.  In conversations with our support group, the biggest issue that people had was "file location".  Most people have no idea where their documents are saved.  If they customized MyDocuments location or lost any RecentDocuments entries they struggle and assume that everything was lost in the upgrade.  File management still continues to be the biggest problem for users, and it's not ever going to change.  I believe that devices like iPhone and iPad succeed because there is no "file system".  Save a photo, and it's available to all applications.  Users don't have to make any choices in that regard; no file names or folders.  That's why the desktop has been customized to allow for as much drop and drag as possible.

Here is a shot of top running with about 100 open LibreOffice instances. It looks like we could easily get another 100-200 instances running, which is wonderful. Typing is crisp and fast.

While watching the LibreOffice server run and doing some slight tuning, I have been able to hack in some features that I wanted to merge into our "Support Portal" software. This software is accumulating and monitoring nearly every click and issue on the GNOME desktop and application servers. One of my pet peeves about software is when there is a tedious task that the computer can and should do for you and the software requires that you do it manually. Much of what we are logging is purely informational, but some of it is "actionable". These are things that require a fix or step in order for the user's issue/request to be resolved. So I have begun to develop the [ Summary ] tab. This section monitors all data that is coming in from various software packages and creates an easily seen tile/button of information. Someone watching the portal is instantly aware of something that they need to do and it's easy for them to find the offending server or application.

I know there are people out there that create UI fulltime and I'm sure these rough screens are hard to view.  :) But at this point this is more about fleshing out ideas and trying to create something useful for our staff.  Time does not allow for fulltime software engineering, this is usually hacked along with many other projects concurrently.

The screen is broken into 18 tiles and the last 18 events that require our attention appear.  The following items have been marked as "actionable" (more to come):

* RSH failure, whereby user is trying to run software and they don't have the right permissions
* CALENDAR failure, Evolution has a bug where if it crashes it occasionally drops their Groupwise calendar.  The user therefore does not get alarms for meetings because the calendar is disconnected.  We get about 1-2 of these a week
* MEDIA request, where users have asked us to send them our open source DVD which contains the software we run at the City for them to take home for Windows/Mac personal computers.
* FORCEQUIT - Networking, which means they logged back into the server and indicated they dropped off the server.  This usually is a cable or jack problem
* FORCEQUIT - Electrical, which means the power dipped and they were kicked off.  This means that the user's UPS is probably dead, or they are not plugged into the battery side.  All of our employees have a UPS; if you have been to Florida in the summer you know why.
* LOAD, one of the servers has gone over 10% CPU usage, this very often means an errant process
* PRINT, there are print jobs that have not flushed from one of the servers within a 15 minute period.  This usually means paper jam, out of toner, etc.  Support can connect to the printers with a browser and debug what's happening.
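The "last 18 actionable events" behaviour maps naturally onto a fixed-length queue. The sketch below is an illustration rather than the portal's code: the class and method names are invented, and the event names are copied from the list above.

```python
from collections import deque

ACTIONABLE = {
    "RSH",
    "CALENDAR",
    "MEDIA",
    "FORCEQUIT - Networking",
    "FORCEQUIT - Electrical",
    "LOAD",
    "PRINT",
}
TILE_COUNT = 18  # the screen is broken into 18 tiles


class SummaryBoard:
    """Keep only the most recent actionable events, one per tile."""

    def __init__(self, tile_count=TILE_COUNT):
        # A deque with maxlen silently drops the oldest entry when full.
        self._tiles = deque(maxlen=tile_count)

    def add_event(self, kind, detail):
        """Store the event if it is actionable; return whether it was kept."""
        if kind not in ACTIONABLE:
            return False  # informational only, no tile
        self._tiles.append((kind, detail))
        return True

    def tiles(self):
        """Return the current tiles, newest first."""
        return list(reversed(self._tiles))


if __name__ == "__main__":
    board = SummaryBoard()
    board.add_event("PRINT", "queue stuck for 15 minutes")
    board.add_event("LOGIN", "purely informational")
    print(board.tiles())
```

Anything not in the actionable set is still free to be logged elsewhere; it just never takes up a tile.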

When you hover your mouse over the button tile, it indicates which tab contains more detailed information and also provides a FIX which normally resolves the issue.  Clicking on the button brings up a user detail screen.
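The hover text can be thought of as a lookup from event type to likely cause and first thing to try. In this sketch the causes paraphrase the list above, while the FIX wording, the fallback text and the function name are assumptions made for the example.

```python
# kind -> (likely cause, suggested first step); wording is illustrative.
FIX_HINTS = {
    "RSH": ("user lacks permission to run the software",
            "check the user's permissions"),
    "FORCEQUIT - Networking": ("usually a cable or jack problem",
                               "check the network cable and wall jack"),
    "FORCEQUIT - Electrical": ("UPS is probably dead or not on the battery side",
                               "check the UPS and which outlets are in use"),
    "PRINT": ("usually a paper jam or out of toner",
              "connect to the printer with a browser"),
}


def hover_text(kind):
    """Build the tooltip shown when the mouse hovers over a tile."""
    cause, fix = FIX_HINTS.get(
        kind, ("no hint recorded", "see the detail tabs"))
    return f"{kind}: {cause}. FIX: {fix}."


if __name__ == "__main__":
    print(hover_text("PRINT"))
```

Keeping the hints in one table means support staff can improve the advice without touching the UI code.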

This new area is underdeveloped, but I'm looking forward to continued progress and testing.  I'm also looking forward to NX client for iPad 3 which will allow us to carry this information around at all times.

Tuesday, May 01, 2012

LibreOffice Live, Better Crash Experience

Today was the big day, LibreOffice went live.  We had been testing it for many months, and it fits so well into our architecture that I was confident it would go pretty well; but one never knows.  Last night before I left I reset everyone back to defaults one more time and then when I arrived today I put it live.  The migration was painless for City employees because it didn't cause any disruption.  The next time they requested the word processor, they were pointed to LibreOffice.  Those people that were already in OpenOffice continued in their session until it was closed.  All launch scripts are in common scripts, so I only had to make about a handful of changes and it was done.

The helper applications that I have described in the past seemed to work well in testing and have been deployed.  Notify-send is used in the lower right hand corner to give users tips and FAQs.  The lower left hand corner is a popup button on a timer that allows them to remove deadlocked instances from the process list.  If you leave it alone, it counts down to zero and then just goes away. We have had a few people not read it, and just click on it and then nothing opens because all processes were halted.  Technique will improve quickly as people figure it out.  Non-intrusive, intrusion.  This is required to allow users on 24 hour shifts to clear out processes when IT Support is not available.

With about 100 documents open and about 25 users typing, top looks great.  A few of these processes are still OpenOffice from earlier this morning.  All OpenOffice sessions will flush out as users close documents.

For the number of users on the network and the amount of work being done, things are very stable in regards to the GNOME desktop. But as is the case with all software, sometimes things crash. I kind of had a "duh" moment yesterday when I realized that I could improve their experience. The GNOME server of course calls bug-buddy when software crashes; by design it gives users a box that contains information that means nothing to them. A while back I configured bug-buddy on our Evolution server to automatically grab their backtrace and dump it to a flat file. My logs indicate that a few people each day are dumping mail-notification and avant-window-navigator, and the server was just giving them the bug-buddy UI and leaving them on their own. Shame on me. With the panel gone, the session is worthless. So I moved the bug-buddy binary out of the way and inserted a custom script that logs the crash to our tracking software, and then simply restarts the application for them. I kill -11'd a session next to me for testing and it's working great. The user sees a short blink and a notify-send message about the restart, and their critical pieces come back. I'm logging all crashes to see what else might be crashing, but so far it's quiet. I'm thinking this is affecting maybe 5-6 users a day. If other components are having problems, I should be able to do something similar and just get them up and running again. The experimental script is below. I'm pondering the potential for an endless loop and will monitor and adjust as needed.
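Separately from that script, here is a hedged sketch of one way the endless-loop concern could be handled: refuse to restart an application that has already been restarted several times in a short window. Everything in it is an assumption for illustration, including the class name, the limit of 3 and the 60 second window.

```python
import time
from collections import defaultdict, deque

MAX_RESTARTS = 3     # assumed limit
WINDOW_SECONDS = 60  # assumed window


class RestartGuard:
    """Allow automatic restarts, but stop if an app keeps crashing."""

    def __init__(self, max_restarts=MAX_RESTARTS, window=WINDOW_SECONDS):
        self.max_restarts = max_restarts
        self.window = window
        self._history = defaultdict(deque)  # (user, app) -> restart times

    def allow_restart(self, user, app, now=None):
        """Return True and record the restart, or False if the limit is hit."""
        now = time.time() if now is None else now
        recent = self._history[(user, app)]
        # Forget restarts that are older than the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_restarts:
            return False  # crash loop: leave it down and let support look
        recent.append(now)
        return True


if __name__ == "__main__":
    guard = RestartGuard()
    for second in range(5):
        print(guard.allow_restart("ann", "avant-window-navigator", now=second))
```

A crash handler would call allow_restart before relaunching, and log an actionable event instead of restarting when it returns False.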