Installing git on OSX Tiger to work with SVN.
First off, there is no dmg available for Tiger, so you need to build from source.
Building Git from source is easy on OSX: just download one of the tarballs from kernel.org, then configure, make, install. Getting it to work with SVN is not quite so easy, because by default it binds to the OSX perl, which may or may not have Alien::SVN.
MacPorts appears to work: sudo port install git-core +svn
Kernel DB Reverse Engineer with Cayenne
Since I have been playing with Cayenne, I thought I would have a look at what it would do with the Sakai Kernel database. This is the limited set of services that represent a minimal kernel, with no UI.
After fixing a bug and getting the database built by starting tomcat (2h), I popped up the Cayenne modeler and reverse engineered the DB (10s). Garbage in, garbage out, I thought. Well, not exactly: the modeler identified a number of tables with no proper PKs, generated all the classes, and generated a significant number of the relationships between the tables. It’s going to be interesting to see if these classes can replace the data storage area, especially if it’s going to give us enough control in the AuthZGroups service to get any performance.
Oh, and I now have core SQL DDL scripts for DB2, Derby, Ingres, Postgres, Sybase, SQLServer and naturally HSQLDB, Mysql and Oracle…. all thanks to the cayenne plugin.
One other reason for thinking that this is worth doing is that Cayenne considers the request cycle to be the transaction boundary, and has built-in caching. A test query I did in the OpenSocial model took 49ms the first time round and 3ms the second time round, indicating that the caching is working (the MySQL query cache was disabled to highlight DB load).
Cayenne Plugin
I have been playing with Shindig to create a data model to go behind the OpenSocial API. At first I thought it would be simple to use JDBC directly, but it turns out that the model has a reasonable number of dependencies and the entities are quite big. So I decided to use Cayenne. The model is created and has some simple test cases, and it all appears to work. I did write a maven plugin to generate SQL scripts from the Cayenne model, so it supports most major databases (about 8). Creating the model from some Java interfaces was quite easy; in fact I did most of it by converting the Java classes into Cayenne XML.
The SQL that this generates at runtime, when querying etc appears much cleaner than Hibernate…. from a DBA point of view.
- https://source.sakaiproject.org/contrib//tfd/trunk/social-db/ The OpenSocial API Data layer
- https://source.sakaiproject.org/contrib//tfd/trunk/social-db/src/main/java/org/apache/shindig/social/opensocial/db/SocialMap.map.xml The Cayenne model
- https://source.sakaiproject.org/contrib//tfd/trunk/maven-cayenne-plugin/ A maven 2 Cayenne plugin that generates the SQL scripts for creating the DB
plexus-util build problems
Looks like there are some dependency issues with building maven plugins. Plexus Utils 1.1 appears to be at fault, with messages such as NoSuchRealmException. If you get this, try upgrading the maven api to 2.0.6 or later in your plugin build. Worked for me.
Widget accessibility and Shindig
I have spent most of the day playing with Shindig, integrating its data model into Sakai and learning Guice, which is very nice and simple, especially its error feedback, which just seems to get straight to the problem.
While doing this, on and off I have written a page on Widget Accessibility more because I have been struggling to understand what we really need to do, and want someone to tell me what’s wrong. If you are reading this, and you know all about ARIA and how to write accessible web applications, please tell me what not to do…. but please keep it simple so I can understand what you are talking about.
Offline Maven 2
Ever tried to run maven 2 with no network and snapshots, after midnight? It’s hard, because it likes to check for updates of snapshots. The following in ~/.m2/settings.xml might help.
<?xml version="1.0"?>
<settings>
  <profiles>
    <profile>
      <repositories>
        <repository>
          <id>local-repository</id>
          <url>file:////Users/ieb/.m2/repository</url>
        </repository>
      </repositories>
      <id>local-offline</id>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>local-offline</activeProfile>
  </activeProfiles>
</settings>
Javascript Templates
There are 2 main ways of generating markup from within javascript. Either you manipulate the DOM, by injecting strings or by direct DOM manipulation, or you use templates.
For the java developer, DOM manipulation and string injection are rather like hard coding strings into your java code. Templates are more like using jsp, velocity or freemarker.
So I started to look at writing a multi file upload that gave progress feedback, and rapidly found that many of the off the shelf components that did this, didn’t expose the markup, so I had to accept the results as they were, or as I could manipulate them with css. If the markup was wrong, I had to dig deep, sometimes very, into the javascript code.
Then I found TrimPath http://code.google.com/p/trimpath/wiki/JavaScriptTemplates , on Google Code, a templating language for Javascript. It looks a bit like Smarty templating from PHP and has some similarities to Velocity. It works by processing a template, either as a string or as an element from the HTML DOM, and merging it with a tree of javascript objects, just like Smarty and Velocity do.
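To make the "merge a template with a tree of objects" idea concrete for the java developer, here is a minimal sketch in Java. It is not TrimPath's API or implementation; the class name, the method names and the ${a.b} placeholder syntax are all assumptions chosen for the illustration, and nested maps stand in for the tree of javascript objects.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: a tiny template merge, not TrimPath itself.
class TemplateMergeSketch {
    // Matches placeholders such as ${file.name} (syntax assumed for this sketch).
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z0-9_.]+)\\}");

    /** Replaces every placeholder in the template with a value from the context tree. */
    static String merge(String template, Map<String, Object> context) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = resolve(context, m.group(1));
            // Unknown paths become an empty string; quoteReplacement keeps '$' and '\' literal.
            String text = (value == null) ? "" : value.toString();
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Walks a dotted path such as "file.name" through nested maps. */
    static Object resolve(Map<String, Object> context, String path) {
        Object current = context;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }
}
```

The point is the same as with jsp or velocity: the markup lives in the template string (or, in the browser, in the page), and the code only supplies the data.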
The result is all the markup is now in the html page, and so can be designed and edited. You can see the results by looking at the code at https://saffron.caret.cam.ac.uk/svn/projects/MyCamToolsAlpha/trunk/files/widgets/DropBox/index2.html
This actually runs from SVN, although some versions of Firefox don’t recognize the CSS files as apache isn’t configured quite right… Oh… and it’s trunk, so it’s going to change as I work on it. (No need to deploy any more, just run Sakai direct from SVN.)
Lucene Index Merge and Optimisation
Lucene index merge has some parameters that affect how the index is built. This has an impact on the index operations other than search. The MergeFactor controls how many documents are stored within each segment before a new one is started, and how many segments are started before they are collected into a larger one. So a factor of 10 means 10 documents before aggregating, and 10 aggregated indexes of a certain size before aggregating again. Consequently MergeFactor controls the number of open files.
The higher the merge factor, the faster the index build, as merging of segments is less frequent. However, this causes a significant slowdown in the speed at which an index can be added to an existing one, as this appears to depend on the number of files lucene has to open.
The next one is the MaxBufferedDocs parameter which controls the number of documents to buffer in memory before flushing to disk. For a batch index operation the higher this is the higher the index performance but the more memory will be consumed.
And then there is MaxMergeDocs, which limits the maximum number of documents within a segment, above which merging does not happen. This is used to limit the file size, so that no file is over 2GB on a 32-bit system.
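How the three parameters interact can be seen with a small simulation. This is only a sketch of the behaviour described above, not Lucene's actual merge policy: the class and method names are hypothetical, segments are reduced to plain document counts, and the assumed rule is that the newest mergeFactor segments of equal size are merged unless the result would exceed maxMergeDocs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the merge behaviour described above; no Lucene code is used.
class MergeFactorSketch {

    /** Returns the sizes (in documents) of the segments left after indexing totalDocs. */
    static List<Integer> simulate(int totalDocs, int mergeFactor,
                                  int maxBufferedDocs, int maxMergeDocs) {
        List<Integer> segments = new ArrayList<>();
        int buffered = 0;
        for (int i = 0; i < totalDocs; i++) {
            buffered++;
            if (buffered == maxBufferedDocs) {
                // Flush the in-memory buffer to a new on-disk segment.
                segments.add(buffered);
                buffered = 0;
                mergeTail(segments, mergeFactor, maxMergeDocs);
            }
        }
        if (buffered > 0) {
            segments.add(buffered); // final partial flush
        }
        return segments;
    }

    /** Merges the newest mergeFactor segments while they are equally sized and small enough. */
    private static void mergeTail(List<Integer> segments, int mergeFactor, int maxMergeDocs) {
        while (segments.size() >= mergeFactor) {
            int n = segments.size();
            int size = segments.get(n - 1);
            long sum = 0;
            boolean sameSize = true;
            for (int i = n - mergeFactor; i < n; i++) {
                if (segments.get(i) != size) {
                    sameSize = false;
                }
                sum += segments.get(i);
            }
            if (!sameSize || sum > maxMergeDocs) {
                return; // nothing to merge, or the merged segment would be too large
            }
            for (int i = 0; i < mergeFactor; i++) {
                segments.remove(segments.size() - 1);
            }
            segments.add((int) sum);
        }
    }
}
```

Under these assumptions, simulate(1000, 10, 10, Integer.MAX_VALUE) collapses to a single segment, simulate(990, 10, 10, Integer.MAX_VALUE) leaves 18 segments, and capping maxMergeDocs at 100 leaves ten segments of 100 documents, which is the trade-off between build speed and the number of files that have to be opened.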
In running the Sakai search indexer operations I have noticed some things in this area
- Once there are about 50 index directories in a merged index, merging takes 2s per merge. Performing an optimize on the index restores the addDirectory operation to 20ms or less. It makes sense to optimize an index when there are more than 50 directories in the index.
- When performing a merge and optimize of a set of indexes, the optimize step can take a lot of time (minutes). However, I have observed that if the index directories are added to an empty index, in the sequence that they were created, the optimize operation is much faster. This may be because the aggregation steps are simpler. This is only an observation.
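The two observations above could be turned into a simple merge plan: add directories in creation order, and optimize once more than 50 have accumulated. The sketch below only shows that bookkeeping. The IndexDir descriptor, the method names and the fixed threshold are hypothetical, and the actual Lucene calls to add directories and optimize are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical planning helper; it does not call Lucene.
class MergePlanSketch {

    /** Hypothetical descriptor for one index directory waiting to be merged. */
    static final class IndexDir {
        final String path;
        final long createdMillis;

        IndexDir(String path, long createdMillis) {
            this.path = path;
            this.createdMillis = createdMillis;
        }
    }

    // Threshold taken from the observation above (about 50 directories).
    static final int OPTIMIZE_THRESHOLD = 50;

    /** Returns a copy of the directories ordered oldest first, leaving the input untouched. */
    static List<IndexDir> inCreationOrder(List<IndexDir> dirs) {
        List<IndexDir> sorted = new ArrayList<>(dirs);
        sorted.sort(Comparator.comparingLong(d -> d.createdMillis));
        return sorted;
    }

    /** True once the merged index holds more directories than the threshold. */
    static boolean shouldOptimize(int directoriesInIndex) {
        return directoriesInIndex > OPTIMIZE_THRESHOLD;
    }
}
```

Whether creation order helps in general is untested here; the sketch just encodes the observation so it can be tried against a real index.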
Installing Sources.
To be able to install jar sources you can run the source:jar goal together with install (mvn source:jar install). That will put the source jars into your local repo, so you can use them in eclipse.
Reducing Working Code Size
How many of us load the whole of the Sakai code base into eclipse, and wonder why it consumes so much memory? Most, I guess. Alternatively you can just load the code you are working on and use the local maven repo for the Sakai jars; that way eclipse will run in considerably less memory. When you need to access the source code, if the repo has the source jars, then they can be used instead of the live code base. Obviously this doesn’t allow you to edit any code anywhere… but then should we all be doing that anyway, except for those rare debugging exercises?
I did the above for search, editing the .classpath file for eclipse, and now I can just work on search with all the other projects closed. Eclipse memory usage has dropped from 1G to closer to 128M. Once we package the core (bin and src) into a maven repo, it’s going to make sense to use this approach. Fortunately maven has support to help us.