Easy 3D

Cool Swing/3D demo! I just checked out a Swing-based Java3D demo from Romain Guy’s Weblog, and it looks pretty cool. You see books as 3D images from Amazon, and watch them rotate in real time to display the spine as well as the front cover. To run it, you’ll need to have Java3D installed, from the Java 3D Website.

There are a number of interesting demos for Java3D around, but no killer apps that I’ve seen. I haven’t tried coding in it yet, but I’ve had a couple of ideas.

One would be to define a set of utility functions in a “scripting” language like BeanShell to make throwing together Java3D apps easy. It seems like the sort of toolkit where the more experimentation you can do without pulling your hair out over the API, well, the more experimentation you will do. And we need to experiment to see where a 3D interface or 3D-enabled application can be useful.
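
To make this concrete, here is a minimal sketch of the kind of one-call utility I have in mind, written against the standard Java3D utility classes (the QuickCube name and the parameter choices are mine, just for illustration; a real toolkit would expose speed, axis, lighting and so on):

    import javax.media.j3d.*;
    import javax.vecmath.Point3d;
    import com.sun.j3d.utils.universe.SimpleUniverse;
    import com.sun.j3d.utils.geometry.ColorCube;

    public class QuickCube {
        // One call: open a window with a cube spinning about the Y axis.
        public static void spinningCube() {
            SimpleUniverse universe = new SimpleUniverse();
            BranchGroup scene = new BranchGroup();

            TransformGroup spin = new TransformGroup();
            spin.setCapability(TransformGroup.ALLOW_TRANSFORM_WRITE);
            spin.addChild(new ColorCube(0.4));

            // One full rotation every 4 seconds, looping forever.
            Alpha alpha = new Alpha(-1, 4000);
            RotationInterpolator rotator = new RotationInterpolator(alpha, spin);
            rotator.setSchedulingBounds(new BoundingSphere(new Point3d(), 100.0));
            spin.addChild(rotator);

            scene.addChild(spin);
            universe.getViewingPlatform().setNominalViewingTransform();
            universe.addBranchGraph(scene);
        }

        public static void main(String[] args) {
            spinningCube();
        }
    }

From BeanShell, a script could then just call spinningCube() and have a scene on screen in one line; wrap a dozen such functions around lights, text and mouse behaviors, and experimentation gets a lot cheaper.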

Then I got an idea for a simple file browser app. The app would show files in a directory, with the standard icons: a folder, a text file icon, a JAR icon, etc. Click on an icon and it turns out to be a cube, which rotates on each mouse click. Different sides of the cube show different “aspects” of the file: a preview of the contents, a list of file attributes, a little calendar for the last-modified date, a picture of the author :). It could be a quicker way to look over directory contents than the equivalent list of all the file details…who knows.

Or, here’s one: a weather icon, again as a cube. One side shows an (actual) picture or stream of the city you’re looking up; another side shows weather stats; another side shows driving conditions.

Or, movies: side 1 shows the movie as it’s playing, side 2 shows the director and his resume (from IMDB), side 3 shows reviews from Rotten Tomatoes, side 4 shows how to order it online (in case you rented it).

I know, I’m leaving out two sides of my cube. But then, that’s usually the problem, isn’t it? How do we “navigate” in a 3D environment? The easiest thing I can think of is to “rotate” the cube just by clicking on it, avoiding mouse gestures or other special tricks entirely. Key commands? Anyone remember I, J, K, M?

Ok, one last one: instant messaging. Keep multiple conversations going on sides of the cube. And have a separate floating cube that rotates to show a pic of the current speaker (most recently received message).

My guess is it’s anyone’s guess whether 3D interfaces will be widely useful for anything in particular we need to do.

(originally posted on this JRoller entry)


Modality

I was thinking about the MVC (Model-View-Controller) design pattern while looking at an application with notifications. The application is mainly accessed through a complicated GUI client. Notifications are triggered by data-bound conditions, and when a notification fires, either colors on the screen change or a popup window opens. But that’s about it. There is no way to configure a notification other than through the dialog, and no outputs of a notification other than coloring or a popup dialog. What interests me is how one can design for multi-modal input and output, and what advantages this offers.

Let’s say we participate in a discussion group on the web. We normally go to a specific website and check for new postings in the discussion forum. We can also get a daily email of all new postings that day, or an email whenever a new posting is made. We may ask to “monitor” a specific thread for activity–for example, to receive an email when someone responds to a question we’ve posted.

A long time ago, the standard mechanism for tracking and participating in ongoing discussions was the mailing list. One didn’t “monitor” a mailing list (though you could probably write a tool to help you); rather, one received an email for each posting as it was submitted (or a daily “digest” of the postings that day). With the advent of the web, the archive of the discussions could be posted to a web page, with a topic list, archive search, and various thread-navigation tools. Later, “forums” software became available, where one could not only view active postings, but could submit, search, filter, monitor and so on, all within a browser.

The trick is that all of these “modalities” for accessing discussions share (or could share) the same underlying processing mechanisms. We have storage of “threads”, “topics”, and “messages”, and some “participants” who are active in the discussion, along with email addresses, etc. What differs between the email-only mailing list and the web-based discussion board is the mode in which we participate: email, web-forum, web-based mailing archive, etc. We could also have command-line access, a “rich client” GUI application, integration with our GUI email client, and so on. What stays the same are the underlying features and, possibly, a common “discussion server” that manages them. What differ are the tools that we use to follow and participate in the discussion.

What I suggest is that applications that are built to support multi-modal use are more flexible, and will have a longer lifespan, than those built around a single mode of access. They are more flexible because the users can choose which mode of access is most convenient for them at any given time. They should have a longer lifespan (active use of the software package), because as new modes are “invented”, the new mode can (hopefully) just be strapped on to the existing software package without requiring a complete rewrite.

So, for example, if we build a discussion server, why not allow input through various interfaces (direct email, REST technologies, SOAP, RMI) and output through various channels (email, messaging servers, streams, XML)? We don’t even need an “uber-API” that forces all our users to access over SOAP, for example–what we need are input and output Adaptors that allow different inputs and outputs to work with the same discussion server functionality.
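
As a sketch of what I mean, the core could be a plain Java interface, with each mode implemented against it. All the names here are invented for illustration, not any real package:

    import java.io.IOException;
    import java.util.List;

    // The core functionality, independent of any mode of access.
    interface DiscussionServer {
        Message post(String topicId, String author, String body);
        List<Message> messagesFor(String topicId);
    }

    // Each input mode (email gateway, REST servlet, command line)
    // adapts its own protocol to the same core calls.
    interface InputAdaptor {
        void start(DiscussionServer server) throws IOException;
    }

    // Each output channel (email digest, XML feed, IM bot) renders
    // the same events in its own format.
    interface OutputChannel {
        void onNewMessage(Message message);
    }

    // Minimal message type so the sketch is self-contained.
    class Message {
        final String topicId, author, body;
        Message(String topicId, String author, String body) {
            this.topicId = topicId;
            this.author = author;
            this.body = body;
        }
    }

Adding a new mode then means writing one more adaptor, not touching the server.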

I brought up MVC because when we think about MVC, we often think about writing a new user interface for some model that has not changed, or about changing the model without affecting our existing user interfaces. But the concept of a “view” in MVC is not limited to GUIs, of course. Fundamentally, we have a separation of our core functionality from our input to, and output from, those functions. If we build to support that separation, we allow our users (and other developers) more room to play and experiment with new models for accessing and contributing to the underlying process.

For example, what I really want are different input modes to look up words on dictionary.com or LEO. I also want their output in XML. If I had both of those, I could easily write a plugin for jEdit, or for Firefox, to look up words, using whatever UI catches my fancy. Currently, I have to adapt on my side to their input format (which is HTTP/forms) and output format (which is HTML). The output format is a particular problem, because if I parse and scrape their HTML for results, any small change in their HTML layout can break all my routines. If I had multiple input formats (for requesting a word-lookup) and output formats (say, HTML, XML, CSV), I could write all sorts of clients to work with the underlying feature (looking up definitions and translations).
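
For instance, if dict.leo.org offered an XML output format, the client side shrinks to almost nothing. This is wishful thinking (the URL, the query parameters, and the element names below are all invented), but it shows how little code a plugin would need:

    import java.io.InputStream;
    import java.net.URL;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    public class WordLookup {
        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint; no such URL actually exists.
            URL url = new URL("http://dict.leo.org/lookup?format=xml&word=" + args[0]);
            InputStream in = url.openStream();
            try {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder().parse(in);
                // "definition" is an invented element name.
                NodeList defs = doc.getElementsByTagName("definition");
                for (int i = 0; i < defs.getLength(); i++) {
                    System.out.println(defs.item(i).getTextContent());
                }
            } finally {
                in.close();
            }
        }
    }

Any small change to their HTML layout would then no longer matter: the XML is the contract.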

Worse, if people start writing HTML screen-scrapers to pull definitions out of a web page returned by dict.leo.org, and if those users rely on that output format, then dict.leo.org will have more trouble upgrading to a newer improved (or just different) layout–because they will break all those screen-scrapers.

In some cases we have a financial problem: there is a non-trivial cost to building and maintaining web-based services, and many of the websites I want information from are paid for through advertising. If they give me a way to access the data without requiring me to see click-through ads, then the website no longer automatically gets advertising revenue when I look something up. If I don’t display the ads in my client, then I am basically getting the information for free. It may be true that information *wants* to be free, but given there is a cost to writing dictionaries and the like, it’s not always true that it will be free. So, for some of these information sources, we get tied in to using their website, because the website owners have no other reasonable way to make a living without website-based advertising.

But, I still believe this is a noble goal. The GUI application I was looking at today supports one mechanism for configuring notifications (a dialog), and two ways notifications are delivered (on-screen coloring and popup dialogs). If the notification subsystem was opened up, I could write my own configuration tools for notifications–for example, from the command line, macros, or through email–and I could receive notifications through different channels–say, email, IRC, or instant messaging. And I think that would be a good thing. Of course, there will be limits on the flexibility of both the input and the output, but that’s OK with me. What I really want is just some more flexibility than what I’m offered right now.

So how can we get started? An easy first step would be–for websites that don’t depend on advertising-based revenue–to offer all discussion forums in an XML format using a secondary URL. The other half would be an addition to the HTTP request for navigating the forum to request the output as XML. Then publish the XML schema or DTD. For our own applications, why not throw in command-line access in all cases, if it at all makes sense? And if it makes sense from the command line, how about an Ant task as well?
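
For the XML half, something like the following is all I’m asking for; the element names and the DTD are made up, just to show the shape of it:

    <?xml version="1.0"?>
    <!DOCTYPE forum [
      <!ELEMENT forum (thread*)>
      <!ELEMENT thread (message*)>
      <!ATTLIST thread id CDATA #REQUIRED subject CDATA #REQUIRED>
      <!ELEMENT message (author, date, body)>
      <!ATTLIST message id CDATA #REQUIRED>
      <!ELEMENT author (#PCDATA)>
      <!ELEMENT date (#PCDATA)>
      <!ELEMENT body (#PCDATA)>
    ]>
    <forum>
      <thread id="42" subject="Multi-modal design">
        <message id="1">
          <author>patrick</author>
          <date>2004-11-02</date>
          <body>Applications built for multi-modal use are more flexible...</body>
        </message>
      </thread>
    </forum>

With that published, anyone can write a reader in whatever mode suits them, and the site can change its HTML layout whenever it likes.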

Last, I know there is another problem, which is that varying input and output mechanisms only make sense if the fundamental semantics of the calls don’t change. If I query a discussion board for a message, I give a message identifier, and get a single message in return. But can I retrieve all messages for a given month? For small data sets, like a log file reader, this might make sense. But most servers would balk if too many users tried to download that data at the same time. This means that given the nature of the data set, there will be limits on how flexible we can make our input or output modes. But we can do better than we are doing now.

(originally posted as this JRoller entry)


Nailgun is Cool

I’ve been checking out Marty Lamb’s Nailgun project: a Java server that runs classes on demand, driven by a small command-line client written in C, using a custom communication protocol. The C client, of course, starts up immediately, and the cost of starting the JVM is paid once, when the Nailgun server is first launched. This means that classes you execute through Nailgun (called “Nails”) run much faster than if launched directly from the java command line.

I got to use this just in the last week, when I needed to convert BMP files to JPEGs. The document I’m writing has to use JPEGs for images, but the original document had over 100 BMPs, so I had to convert them. The second problem was that I was capturing new images as screenshots, and my version of MS Paint can’t save images as JPEGs, so I needed to convert those, too. I need to focus on the document itself, so I didn’t want to go hunting for some image-manipulation tool. A quick search found a blog by Chet Haase on image conversion. The utility he wrote, JpegConverter, converts all BMP files in a directory to JPEGs, using Java’s imaging libraries.

Putting two and two together, I installed JpegConverter as a Nail, then wrote a batch script to run the Nailgun client on the utility. So when I type

convert

all BMPs get automagically converted to JPEGs, then copied to my target directory. What’s impressive is how fast it runs. If I were using the java command line, I’d have to take the hit of launching the JVM each time. And now I don’t.
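
For anyone who wants to try the same trick: a Nail can be as simple as a class with an ordinary main() method. The class name and paths below are mine, and the exact commands are from memory, so check the Nailgun docs:

    // Any class with a standard main() can run as a Nail.
    // Start the server, register the classpath, then call it via the
    // C client, roughly like so:
    //
    //   java -cp nailgun.jar com.martiansoftware.nailgun.NGServer
    //   ng ng-cp C:\tools\classes
    //   ng HelloNail Patrick
    //
    public class HelloNail {
        public static void main(String[] args) {
            System.out.println("Hello from the warm JVM"
                    + (args.length > 0 ? ", " + args[0] : "") + "!");
        }
    }

The second and later invocations return practically instantly, because the JVM is already up and the class is already loaded.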

There’s another blog entry I’ll write someday with my (unrequested) opinions on Sun’s stewardship of Java, but one point I can make here is that it’s too bad this wasn’t available, like, 10 years ago. There is one very small C program to write, which I imagine is pretty portable, and suddenly you can use Java for all sorts of little utilities that wouldn’t be worth it if you had to absorb the cost of the JVM launch each time. And what is good about this is that you get the memory protection, garbage collection, etc. of Java, with the full reach of the Java API.

I’m hoping that Nailgun takes off. I sent Marty some ideas that would be, for me anyway, useful extensions. One is an easy way to add NG extensions to the Windows (or equivalent *nix) Explorer context menus; then you could run Nails throughout your desktop GUI. WinZip, for example, has a little context menu that appears on files it controls, so you can extract a ZIP file to its current directory without launching the WinZip console. Why not use Java for similar tasks? There are endless command-line utilities people have already written for Java; it’s just not terribly convenient, or fast, to use them.

My guess is there will always be a range of utilities that are just easier to write in C or a scripting language, but pulling the Java API into the day-to-day world benefits everyone. Because apart from memory management, bounded arrays and all that good stuff, we have the other cool things about the language: dynamic reloading of classes through custom classloaders, multi-threading, security managers, remote invocation, and APIs for everything under the (S)un. And those features in turn mean that one can develop utility software more quickly, and with greater expression, than with at least some alternatives. And the nature of packaging and API interweaving means we can build more powerful utilities on top of simpler utilities, gradually working our way up the food chain.

Another idea is that if the server could be written using the Servlet API, speaking its own protocol (or HTTP as an alternative), then we’d have the full range of web containers, including their caching, reload, and management functions, to manage Nails. One more idea I’ll send to Marty.

And last–why not write adaptors to bring in the full set of JVM-based “scripting” languages? See how close we can get JRuby, Jython, BeanShell, etc. to matching their *nix counterparts in common everyday use.

So, check out Nailgun. If you have written Java utilities in the past, try them out as Nails, and send Marty feedback. Maybe we can get a JAR of the most useful ones as a sort of Java performance pack. And maybe Sun will have some interest in backing it as a semi-standard approach to rapid execution of simple utility software in Java. One can only hope.

(originally posted as this JRoller entry)


Ken Moore’s Law: A Rapidly Increasing Frustration

Rushing Towards Frustration

My friend Ken Moore, a programmer like myself (but with far more experience), writes to me periodically about his ongoing frustrations with programming. He’s currently a Java programmer, but has also worked with COBOL (I think), C and C++, Perl, etc., as well as some 4GLs. He’s been around, if you know what I mean. What he often writes about is the frustration not just of learning one new API, but of working with several new ones in a single project. I have to say I agree with him, and that’s what I want to write about today.

The issue isn’t just the difficulty of learning new APIs, but also of integrating them with others (a combination possibly new both to you and to the API authors). It seems that each additional API we pull into a project not only slows down our development effort incrementally, but increases our frustration exponentially. Thus, what I call Ken Moore’s Law: with each different API we integrate into our system, the likelihood of absolute frustration approaches 1.

The Java API Problem

When Java was introduced, the concept of packages and JavaDoc combined to powerful effect. We now had a unit of distribution (the package) along with package documentation, plus a fussy compiler that made sure we used the new distribution correctly. Packages could be nested, and a top-level package could represent not just a set of classes lumped together, but a whole Application Programming Interface. Asking someone if they knew AWT was asking them if they understood the API defined by the java.awt package. A good distribution would also include a manual and plenty of examples. You could ship APIs around, or point someone to them on the web, and they could add a completely unforeseen level of functionality to their application. Later, the arrival of a ClassLoader for JAR and ZIP files eased the distribution problem measurably.

Within a short time, the JDK itself had expanded to include two GUI APIs, web page support, XML, XSL and other X-APIs, sound, cryptography, and so on. Other development efforts led to the massive Apache Jakarta API set, and more. The support for XML grew to about six different, incompatible APIs, and there are as many command-line parameter parsers out there as teas in Long Island.

Worse, to keep up with the rapidly expanding demands for functionality in our applications, we all turned to learning new APIs. The promise was that we could “easily” add sound, graphs, dynamic interpreted languages, CSS 2 and yadayadayada to our apps in no time at all. Our classpaths grew beyond readability. We added trailers to our cars to carry all the new JavaDoc printouts around, plus all those books from O’Reilly.

But, at the end of the day, what we found was rarely easy times but, rather, rapidly increasing frustration. Each new API required us not only to learn the interfaces, classes, methods, factories and configuration involved, but also the order of execution for using API components, odd workarounds, configuration of configuration factories for dynamically generated runtime deployment configuration subsystems, and so on. And many times one API (like EJB) required us to integrate another, completely separate one (like JNDI), which itself had a horrible amount to learn. And then there were bugs, inter-version dependencies, and code forks. My hair has grown grey before its time, and people often now ask me, at 36, how old my grandchildren are.

And so?

The worst thing is that I see no immediate relief from this frustration. We can, for example, try to make API installation easier, integrate JavaDocs, perform automated cross-version checking, and so on. Certainly that would help. But we are stuck with, simply put, too much to learn.

In a sense, Java programmers suffer from a diet too rich. We have the riches: Java APIs are so varied that you can program almost anything with Java these days, and often have multiple choices of API implementation for a given task. But the problem is that our poor human brains have a limited capacity (diminishing over time) for absorbing new information, especially richly structured, hierarchically organized information. And, worse, it would be one thing if every API were well-documented, worked perfectly, and was bug free. The fact that it is not so just adds to the aggravation, more so when we find we just cannot use the Xerces XML 2.0 parser with version 1.5 of our favorite web container.

I think we should look for easier deployment mechanisms and better documentation systems. We also need more working example code. I think there is also better use to be made of “simplifying” APIs, using a Facade pattern, to reduce the number of steps required to get something useful done.
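
To show what I mean by a simplifying Facade, here is a sketch that collapses the usual JAXP parsing ritual into one call (the Xml class name is mine, not from any real library):

    import java.io.File;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;

    public final class Xml {
        private Xml() {}

        // The factory/builder/parse dance, reduced to a single step.
        public static Document parse(File file) {
            try {
                return DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(file);
            } catch (Exception e) {
                throw new RuntimeException("Could not parse " + file, e);
            }
        }
    }

You lose some control (no custom error handlers, no validation settings), but for the common case you also lose three classes’ worth of learning curve.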

Beyond that, I’m just expecting a heck of a lot more frustration in my life. It’s the law.

(originally posted as this JRoller entry)


Fun with XSL

Working on my Website (Part II)

I started to work on this website again a couple of weeks ago. Last year I’d written the content of my website in XML, and generated working XHTML with XSL. All of that was working, but there were problems; namely, I was terrified of changing the XSL once it was working. I mean, there were other issues, such as the inability to nest topics (Home->Writing->Blogs), but the larger issue was that I couldn’t easily change the XSL without breaking something.

Things have improved, though it is far from perfect. I’ve learned more about variables in XSL, as well as imports, and how I can use both to modularize the transformation better.

The cool thing I discovered, sort of to my surprise, is that variables in XSL are more than just placeholders for a single value. Or rather, I can store more than just a string in a variable. I think I knew this, but in coming back to it I’ve realized how cool this is. I can’t explain the technical background very well (because I don’t understand it), but to give an example, I can declare

    <xsl:variable name="key" select="@key" />

which will later let me reference $key containing the attribute value @key for the current node being processed. But it turns out a variable can also hold a node-set, which means the following works as well

    <xsl:variable name="allTopics" select="//topic" />

which means that $allTopics now holds all the <topic> elements in the current document. I can then loop over them with an <xsl:for-each>:

  <xsl:for-each select="$allTopics">something in loop</xsl:for-each>
  

I could even restrict the topics to blogs I want to process using another variable

  <xsl:variable name="allBlogTopics" select="$allTopics[@key='blog']" />

and, to make it better, I can now pass this variable in as a parameter to a template

  <xsl:call-template name="write-blogs">
    <xsl:with-param name="allBlogTopics" select="$allBlogTopics"/>
  </xsl:call-template>
  

and so on and so forth. The ability to capture segments of the document tree in a variable is not only useful, but also a different way of thinking about document processing. XSLT itself serves to convert an input document into an output document. What variables let you do is to ‘snip’ portions of the document around while you are preparing the output. So suddenly I have this sort of clipboard of little document fragments that I can work with. Which is very cool.

Of course, there are problems. Variables in XSL are not typed, and there are no real checks that what you are doing with a variable is meaningful. Or that the variable actually contains anything. If you make a typo on your XPath in the variable declaration, you may get an error (if there is an error in syntax), but you might also just get an empty node-set, meaning that the variable is useless. Now, that might be OK, if your XPath pointed to a fragment that might or might not exist in fact. But if you expect it to be there, you may just find the little typo causes a whole section of output to be skipped.

I assume that if one used an XSL debugger this would become clear. Alas, I do not. The equivalent of a print() statement in XSL is the <xsl:message></xsl:message> command, which is cumbersome to type, not least because to show the value of an element or a variable, you need to nest an <xsl:value-of></xsl:value-of> element within it. Even with the help of my editor, this is cumbersome and, ultimately, tiresome. I guess that’s the right word for it: XSL is tiresome to write.
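
For the record, the idiom looks something like this, checking one of the variables from above:

    <xsl:message>
      allBlogTopics holds <xsl:value-of select="count($allBlogTopics)"/> topic(s)
    </xsl:message>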

So why use XSL? In my case, as I wrote elsewhere, I wanted to separate the presentation from the content, always a noble and probably worthwhile goal. There are, of course, a number of options, but as my goal was to display a website, and since I had used XML and XSL for this purpose before, it seemed only fitting to apply them here as well. The upside is that, for the most part, the bulk of the presentation code is no longer in the pages themselves, outside of some directives for bold-face, underlining, lists, and the like. The downside is that developing the transformation for the pages in XML takes a long time, and the whole thing feels a little fragile: a small change to the XSL, a little typo, and whole sections may just disappear. Still, on the whole, XSL feels mostly natural as a means of transforming XML into XHTML. The XSL actually embeds the output XML within it, so if you back away and squint a little, you get a decent idea of what will come out the other end.

And, truth be told, the fact that XSL is well-defined and has a proper specification is a bonus. There are a number of “templating” languages for web pages, but many were not, as far as I can see, developed by language designers. Which is fine; it’s just that there are more oddities and ambiguities there than I find in XSL.

And, to sum up: XSL is a rather cool way of working with a tree-based structure like XML. Snipping out little document portions just feels more natural in XSL than it does in another language with an XML library, like Java. Every language has a feel to it, and, as far as that goes, I think XSL is both interesting and, in its own humble fashion, rather lovable.

(originally posted as this JRoller entry)


Writing with Ease

Writing and Markup

What I really want is just to write up my website without worrying about markup, and to have full control when I finally present it. The website pages are written and stored as XML before being converted to the XHTML that visitors see. The downside is the amount of typing I’ve forced myself into by using XML.

It would be easier, I guess, if I were using HTML, since there are so many good editors for HTML these days. Most of them give you a WYSIWYG view of the page, so writing feels more or less like using a word processor. For the most part, you just type, and use keyboard shortcuts to mark up boldface, italic, underlines, headings, and so on. Occasionally you have to jump to a menu to insert a table or something, but generally you can spend most of your time writing.

Using XML means I can do neat things, like generate different versions of the website, or parse the pages to look for links, inter-page references, make lists of pages, and dadada. I like that aspect of it. What I don’t like so much is having to surround my writing with silly little tags like

  <p></p>
  <b></b>
  <ul><li></li></ul>

  <code></code>

and so on. I use jEdit to write the pages, and that helps a lot (it closes tags automatically, and checks page validity against the DTD), but it still feels like I am constantly aware of markup while typing.

There are solutions, and the current one I’m thinking of involves a mixture of XML and structured text. Structured text means different things in different contexts, but here I’m thinking about Wiki-style structured text (WST). The people who developed the different Wiki tools decided to make it easy to format your Wiki pages, by letting you type what you want with (almost) no markup at all–or rather, using WST, which is very, very simple. You just type. When you want something in bold, you surround it with *asterisks*; when you want underlines, you surround it with _underscores_. These few conventions are so non-intrusive that it feels just like writing an email. And the cool thing is, when you finally see the rendered Wiki page, you have boldface, italic, headings, lists, links and so forth. If I were using WST, then *this* _little piece of text_ would appear to you like this little piece of text.

I like WST a lot. Once you learn the basic rules, writing a page is just like typing in a text editor. There is really nothing that gets in the way. So my new idea is to use XML to structure the pages in my website, but once I begin a <section></section>, I just type using WST. When my XSL encounters one of those <section> elements, it passes the contents to a Wiki-markup engine like SnipSnap’s Radeox (Google it), which spits out valid XHTML. I am waiting till the rest of my XSL is stable enough to do this, since I am likely to break things when I try to link an extension function into the XSL.

The question is, how does a Wiki accomplish this stunning feat? Well, WST simplifies the markup problem by saying that there are really just a few types of basic markup that you want: bold, italic, underlines, headings, lists, links–and so on. I think the standard WST includes just over a dozen of these. Then they figured out the least amount of typing required to get those dozen variations, and also followed some established conventions people were already using in plain-text email, like ** or __ to highlight text. They also restricted the variations in layout. In other words, they simplified the markup by simplifying the problem.
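
Just to take the mystery out of it, the core of such an engine is little more than a pile of pattern-replacement rules. This toy Java sketch handles only the two conventions I mentioned (real engines like Radeox are far more careful about escaping, nesting, and the full rule set):

    import java.util.regex.Pattern;

    public class MiniWiki {
        private static final Pattern BOLD = Pattern.compile("\\*([^*]+)\\*");
        private static final Pattern UNDERLINE = Pattern.compile("_([^_]+)_");

        // Apply each rule in turn; every rule maps one WST convention
        // to one bit of XHTML.
        public static String render(String wst) {
            String html = BOLD.matcher(wst).replaceAll("<b>$1</b>");
            return UNDERLINE.matcher(html).replaceAll("<u>$1</u>");
        }

        public static void main(String[] args) {
            System.out.println(render("then *this* _little piece of text_ appears marked up"));
        }
    }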

Compared to DocBook, LaTeX, or SGML markup, WST can’t really do much for you. Those let you write complete books, with as much control as you could possibly need. But DocBook, for example, has dozens of tags for formatting documents, in part because “real” writers need so much control over how a “real” book is laid out.

For my purposes in writing a basic website, WST seems like a good solution. Radeox allows you to compose your own definition of a WST, so if I need to I can come up with weird little conventions, so that $% is markup for a filename or something. But the more I deviate from the standard, the more I have to remember; WST works because there is little to remember. You know: expand your capabilities by limiting your options.

(originally posted as this JRoller entry)


Java Memory Management: A Tragedy of the Commons

I was working with NetBeans the other day, debugging some of my own code. I had been running the debugger repeatedly, tracing through the code over and again. NetBeans gradually slowed down, and stepping through lines of code became downright painful. I finally stopped and restarted it.

(I wrote this last Spring, in March 2004 I think, but never posted it because I hadn’t set up a blog on my site. My current kind and convenient hosts don’t have it set up, I didn’t want to futz so…older stuff goes up for now, till I am caught up. Hence the old NB release referenced here.)

Some time later, maybe the next day, I thought maybe this was due to a memory limitation–that perhaps NB was running out of memory and the garbage collector was working overtime trying to free up some small portion near the top that wasn’t spoken for. Sure enough, some time in the past I had capped the memory at 96M in my configuration settings. Increasing it to 128M cleared the problem, at least for the time being.

I’m not writing to bag on NB. It’s a good tool–the 3.6 release is particularly good–and I use it, as well as jEdit, regularly. What I started to think about was the problem with memory usage in Java applications, particularly GUI applications using the Swing toolkit.

The thought I had was that, on the one hand, Java frees us from worrying about memory management, which is great. On the other hand, perhaps this apparent freedom leads us to treat memory too casually–particularly, as an undifferentiated, massive pool from which we can draw without restraint.

If you are reading this blog, you probably don’t need to be told of the advantages of memory management in Java. But just to point out–we have safe access to arrays, protection from invalid pointer references, automatic memory allocation and deallocation. This is all good. It seems to save a lot of time. The garbage collectors have gotten much better over the last few years, and the dreaded, “GC! Stop and put down your weapons!” pause is less noticeable even in complex GUI apps these days.

On the other hand, we now create programs within a safe environment–an environment which, while simpler and less dangerous than where we used to live, also insulates us from the realities of the world our programs run in. These realities cannot be done away with just by locking the door and taping the windows. There are costs to using memory on most operating systems that most programmers develop for these days. There is a cost, in time, to allocate memory, track it, and release it; there are actual limitations to physical memory; and virtual memory on disk is incredibly expensive to use (performance-wise). But we are isolated from these costs because in the sandbox where we live, they are invisible. We have to look for them, test for them, probe for their presence. When they stick their ugly head up, as happened with me, finding the root cause of the problem is almost hopeless, given the growing complexity of our applications. Not impossible, just very time consuming.

What makes this even worse is that in using Java, a great deal of our power comes not from the expressive power of the language itself, but from the large, and growing, library of packages we have access to. This includes not only the large and impressive JDK, but a multitude of free software, open source and commercial packages we may end up pulling into our projects. It takes time enough to find and learn how to use any of these packages. In almost all of those I can think of, there is nothing in the documentation related to how much memory any given class will use, and, by extension, no information about how much a combination of classes in that package will use. I have a 10KB XML file. How much memory will it take when loaded into an XML DOM? What about using toolkit X versus toolkit Y? What if I include or exclude comments from the XML file? Do I have options for reducing the footprint at all? In my experience, we just don’t know. At best we might get a general comment in a toolkit’s readme, something like, “Memory footprint reduced 10% in this release.”
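
You can get a crude answer for one specific case with nothing but the Runtime API. This is a rough probe, not a profiler; System.gc() is only a hint to the VM, so treat the number as an estimate:

    import java.io.File;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;

    public class DomFootprint {
        public static void main(String[] args) throws Exception {
            Runtime rt = Runtime.getRuntime();
            rt.gc();
            long before = rt.totalMemory() - rt.freeMemory();

            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new File(args[0]));

            rt.gc();
            long after = rt.totalMemory() - rt.freeMemory();
            // Reference doc here so it stays live through the second measurement.
            System.out.println(doc.getDocumentElement().getTagName()
                    + " took roughly " + (after - before) + " bytes as a DOM");
        }
    }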

And yet the tools for figuring this out, for making the problem visible, are employed after the fact–memory profilers. Some of the new information available from the VM in JDK 1.5 might make this process a little easier.

I suggest the root problem is that inherent in the language design is a message: don’t worry about it. Don’t worry about how memory is allocated, by which process, using which API call in the O/S. Don’t worry about how much space a given class will take. Just start coding. When you need an object, just instantiate the class. Once your program is written, you can handle major memory problems by just increasing the size of the heap, or by running a profiler and punishing the worst offenders.

To make matters worse, current coding trends often recommend that we use caching to improve performance across an application. So not only do we not know how much memory we are using–we grab it and hold on to it for the long term. I suspect this is a design feature of NB that aggravated the problem. The garbage collector was doing its job–there just was barely any memory it could free up, because most of it was spoken for, and would not be released.

So I was thinking of this as a modern version of “The Tragedy of the Commons”. I won’t repeat that chestnut here. The point is that memory is a shared resource, in two senses. First, it is shared between programs running on your PC, and with the OS itself. Second, it is shared between you (writing your program), other people you are coding with, and every person who had a hand in all those libraries and toolkits you are using. All of them are drawing from the same pool of memory. All of them are acting as if, in general, there were no real cost to using that memory. And even if they did think about it, we probably wouldn’t know anyway.

My general thought here is not that there is a fundamental problem in Java’s memory management model, just that it gives us a false sense of complete isolation and freedom. We are not completely isolated, or completely free from worrying about memory. It’s similar to what Joel Spolsky calls “leaky abstractions”. JDBC doesn’t isolate us from differences in database engines. You could write completely generic SQL using JDBC (I think), but in the real world you find you can’t–you have to optimize for access paths to tables, to take one example, and that may take advantage of indexing features available on one database platform but not another. Your code is pretty portable, but will run differently on another RDBMS, because the JDBC abstraction is “leaky”: in this case, the underlying RDBMS shows through in how the application performs.

So I think the danger is this illusion of freedom. What’s not clear to me is if this is a problem that would be less prevalent if my code were sprinkled with calls to allocate and deallocate memory (thus reminding me of what I was using).

Actually, as I write this, I have few ideas for how to properly name the problem, much less how to suggest an alternative or a solution. The problems seem to include:

  1. Without extensive runtime profiling, I have no idea how much memory my application will require at startup, using various configuration parameters, on different JVMs, etc.
  2. Without extensive runtime profiling, I have no idea how much memory classes, or combinations of classes, will use. That includes my own classes and those in the many other packages I reference in my application, as well as all the classes referenced indirectly by those packages that I know nothing about.
  3. I have few options for controlling memory use once I find a problem. Repeat after me: “I will not recode javax.swing.text to be more memory efficient, I will not recode…” and so on.
  4. If I do accurately profile an application–or, if I do profile a single class–I have no idea how much of the memory use is data dependent, and how much is data independent.
and so on.

Finally, it’s possible that even if I did have all this information, it’s mostly pointless. Outside of very memory-constrained applications, we usually don’t care unless our users are impacted (by being able to run fewer apps side by side without performance degradation due to GC and virtual memory swapping), or unless we are in danger of running out of memory.

(originally posted as this JRoller entry)


Dynamic Languages on the JVM

Dynamic Language Integration in the JVM

I’ve been interested to see the different languages built on top of the JVM. If you do a web search for “jvm languages”, you’ll find a whole slew of them listed, some of which are still in active maintenance. One newish one, Groovy, has been submitted as a standard dynamic language for Java. But I think the current approaches to executing these from within a Java program are too clumsy.

There will be some confusion over terminology here. When I refer to “dynamic” languages, I mean programming languages that a) must or can be interpreted to run, and b) offer a relaxed language syntax compared to Java. That’s a little bit broad, but some examples will clarify: JavaScript (running in or outside the JVM), Python/Jython, BeanShell, Groovy. They are “dynamic” (as opposed to “static”) in that the compiler doesn’t require you to declare all types before you use them, doesn’t check to see if method invocations will actually invoke anything, etc. If you make a mistake in that regard, it is a runtime, not a compile-time, error. I understand there are many subtle variations as regards dynamic versus static (versus latent) typing, and so on. Do some web searching for more detailed discussions.

What I am interested in here is how to use these in the JVM, in particular, how we might speed up development on some types of projects that involve Java. The current problem with Java is that, because of static typing and compiler checks, Java development requires more effort up front (during design and coding), generally slowing down the overall programming project. On the other hand, all these compiler checks reduce some types of errors (which ones, how many, and how useful is up for discussion). But you just have to look at the proliferation of dynamic languages, and the vast number of scripts and programs written in them, to see how much people like a relaxed compiler. As far as I can see, they do. A lot.

This blog isn’t about dynamic languages per se, but about current problems integrating them in a project that also uses Java. The question is whether we can’t have a clearer, more directed path towards integration, such that I can start coding and prototyping in a dynamic language, then re-code where necessary in Java, running both alongside each other all the while.

Now, for dynamic languages that can run in the JVM–of which there are many–it turns out we can do this. The problem is that you can’t do it cleanly: you instantiate a sort of Command object, give it some contextual information, then execute the Command (that’s a generalization, but a common approach).

An Example: Scripts in jEdit

For example, the jEdit text editor lets you write scripts in BeanShell, a dynamic language. I have a script, for example, that takes the current buffer in jEdit (say, a text file), parses it, and creates a stub of a Java interface. jEdit invokes the script with some contextual information: a reference to the buffer, the text area, the jEdit instance, etc. These appear as variables within the script–you don’t declare or initialize them, as they are ready to use. So “buffer” and “textarea” and “view” are all variables that I can use in any BeanShell script within jEdit. When jEdit invokes my script, it pushes these references into a sort of “context” object, which it then provides as a parameter to the script. But the invocation of the script is indirect. jEdit has no idea what object my script will return (if any), and doesn’t know what functions are available in my script, what parameters those take, or what the parameter types are. It sets up a call, and invokes it.

So, instead of (within a jEdit code block)

      InterfaceParser parser = new InterfaceParser();
      String stub = parser.parseBuffer(buffer, textarea, view);
      

it would look something like this (this is just a mock-up, not what jEdit really does *laziness*)

      import bsh.Interpreter;
      .
      .
      .
      Interpreter i = new Interpreter(); // Construct an interpreter
      i.set("textarea", textarea); // Push the context in as script variables
      i.set("buffer", buffer);
      i.set("view", view);
      
      // Eval the script and get the result (untyped, so we have to cast)
      String stub = (String) i.source("interface_parser.bsh");
      

From what I can see, this is a fairly common approach to integrating dynamic languages within Java. The Java compiler checks the definition of Interpreter, but can’t dig down to see what the script itself is doing. If you muck up the parameter types or the return type, runtime exception for you, buddy. Which kind of sucks.

What We Need

What we need, instead, is to be able to treat our dynamic languages as first-class citizens of the Java community. If we could instantiate and invoke dynamic language methods and classes as if they were Java classes and methods, we could start by writing our apps in our favorite dynamic language, then replace parts of them with Java as necessary. This could speed up early development and prototyping, while offering a straightforward migration path to a full-Java application–or at least, the best of both worlds.

An Approach

So, one approach to this is to actually build Java class definitions out of our scripts. Some dynamic languages offer this: you run a pre-compile on your scripts, which outputs Java classes, then you can import, instantiate, extend and otherwise reference them just like Java classes. That is not bad, but it requires, first, a special pre-compile (which adds time if I am editing both the script and the Java), and also, it requires the language to have a clear, non-ambiguous mapping to the Java language, as opposed to a clear, non-ambiguous mapping to JVM bytecode. So, that approach works, and I think the Bistro programming language uses it.

Another approach is to satisfy the compiler. IMO, what the Java compiler wants is to verify the existence and the structure of classes we reference. The class has to be locate-able by a ClassLoader, must have the methods and constructor we are invoking (and they must be accessible), must implement the interfaces we reference, etc. My impression is that, without a complete revision to the Java specification, this should be possible.

My proposal is that when you reference a script, it has an unambiguous name and location, so that a special DynamicLangClassLoader (DLC) can find it. There could be special naming conventions, but basically there would be some mapping between directories and packages, and between a script name and a class name, as with Java files themselves. Our DLC is a special type of ClassLoader that the Java compiler uses to test the structural information (class, implements/extends, methods, etc.) of the script we are referencing. The compiler would use a DLC for certain packages: we identify that a particular package is “managed” by our DLC, so scripts.patrick.bsh.* would be “managed” by our BSHDynamicLangClassLoader. When a reference is made to a class in a “managed” package, the regular compiler ClassLoader defers to the registered DLC, and asks it to verify the requested invocation.

Our DLC is thus a sort of Adaptor between two different subsystems–the Java language subsystem and the dynamic language subsystem. A DLC basically checks the script and reports back to the Java compiler whether the script can actually be invoked as the Java program is requesting. The DLC could do this by running an interpreter over the script, or maybe by using some reflection on our dynamic language bytecode to verify it. The Java compiler shouldn’t care either way.
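
To make the shape of this concrete, here is a sketch of the contract such a loader might expose to the compiler. Everything here is hypothetical: no such hook exists in javac today, and all the names are mine:

    import java.util.List;

    interface DynamicLangClassLoader {
        // Which packages this loader "manages", e.g. scripts.patrick.bsh.*
        List<String> managedPackages();

        // Can the script behind this name be treated as a Java class at all?
        boolean canResolve(String className);

        // Does the script declare a method with this exact signature?
        // The compiler would call this instead of reading a .class file.
        boolean verifyMethod(String className, String methodName,
                             Class<?>[] parameterTypes, Class<?> returnType);
    }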

The only restriction is that we must be able to interpret the script as a non-ambiguous Java class. There has to be a Class definition, function prototypes, etc. Method parameters and return values must be valid Java classes, or must themselves be dynamic language constructs which, at some point, resolve to valid Java classes. Ditto for interfaces and superclasses.

Looking again at the example I gave above,


      import scripts.patrick.bsh.InterfaceParser;
      .
      .
      .
      InterfaceParser parser = new InterfaceParser();
      String stub = parser.parseBuffer(buffer, textarea, view);
      

The scripts.patrick.bsh package would be managed by the BeanShell DLC. It would look in the classpath (for example) under /scripts/patrick/bsh for a file named “interface_parser.bsh”. It would load up that script and verify that there was a parseBuffer() method with three Java class arguments, for buffer, textarea and view, that returned a String. It would report back to the Java compiler that, yes, these existed. And the compiler would go along its merry way.

And, the DLC could be used by the JVM at runtime to convert a script file to bytecode if it wasn’t already in bytecode.

The Catch

The catch to this is that, first, our DLC must be able to unambiguously satisfy the Java compiler. The script must be executable as if it were a Java language class. A second catch is that, if there were bugs in the interpreter, or if the script was changed between compilation and runtime, we’d get a runtime exception, and probably not a very friendly one at that.

Onwards

I think this is doable within the framework of the Java Language Specification without introducing some nightmarish and long new specification process.

(originally posted as this JRoller entry)
