Fred Mikkelsen's Blog
Fred Mikkelsen's Homepage
Fred Mikkelsen has seven years of experience in telecom OSS systems integration and was the co-author of an early CORBA ORB. He holds a BSCS and pursued graduate studies in Combinatorics.
Top K Queries ... some ideas
Posted by fmikkels on March 21, 2008 at 11:21 AM | Permalink
| Comments (0)
Balancing maximal performance against the real-world constraint of limited computing resources is the driving thought behind this blog. For Top-K Queries, what can you do to restrict the possibility that a query will consume more resources than you can reasonably allocate for a service call? If a query needs to be interrupted, how can you return partial results that are pretty good?
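One way to picture the question is a Top-K scan with a hard work budget. The sketch below is only an illustration, not the approach from the full post: the class name BudgetedTopK, the method topK, and the idea of counting scanned items as the "resource" are all assumptions made for the example. It keeps the best K scores seen so far in a min-heap and stops when the budget runs out, so whatever is in the heap at that moment is the partial answer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public final class BudgetedTopK {
    /**
     * Returns the k largest scores among the first maxItemsToScan items,
     * sorted from largest to smallest. If the budget is exhausted before
     * the input ends, the result is a partial (best-so-far) answer.
     * Hypothetical helper: the budget unit (items scanned) is an assumption.
     */
    public static List<Integer> topK(Iterator<Integer> scores, int k, long maxItemsToScan) {
        PriorityQueue<Integer> best = new PriorityQueue<>(); // min-heap: smallest kept score on top
        long scanned = 0;
        while (k > 0 && scores.hasNext() && scanned < maxItemsToScan) {
            int score = scores.next();
            scanned++;
            if (best.size() < k) {
                best.add(score);
            } else if (score > best.peek()) {
                best.poll();       // evict the weakest of the current top k
                best.add(score);
            }
        }
        List<Integer> result = new ArrayList<>(best);
        result.sort(Comparator.reverseOrder());
        return result;
    }
}
```

Memory stays bounded by K and work stays bounded by the budget, which is the trade the question above is asking about. How good the partial answer is depends on the order in which items arrive.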
SaaS! Virtualization! Surveys?
Posted by fmikkels on February 20, 2008 at 11:18 AM | Permalink
| Comments (0)
From 1 to 10, how are you computing today?
I have always had an issue with surveys. News media loves surveys. "8 out of 10 surveyed say they 'strongly disagree' with their car being towed, yet 9 out of 10 'strongly agree' that the city council should do something about cars parked at expired meters. How can we reconcile this polarization in America? ... news at 10"
I find it interesting that Sociology students could get Doctoral degrees by asking people how they felt about something and publishing the results of what they said, provided they did it in the presence of statistics. As I recall, to get a Ph.D. in Mathematics you had to prove a theorem, not just run some numbers.
With the rise of relational databases, the Internet, even adopting Voice over IP (VoIP), I have seen the same cycle being repeated. CFOs, CEOs, and CTOs are quoted in analyst surveys outlining the relative level of activity in planning in these areas. Depending on the study, 80% may be 'considering', while 40% are 'actively pursuing' and 20% have 'implemented' some form of the technology. Now, we see the same surveys being implemented with SaaS and Virtualization. To this I borrow the catch phrase for a caffeine-laced ginseng beverage ...
WAKE UP PEOPLE !!!
These surveys aren't random. The CEOs aren't asked questions like, "Does your company seek to incorporate an animal mascot in its advertising?" Or, "Are you planning to re-diversify into manufacturing buggy whips?"
The thing that I liked about the Sciences (and I'll put Computer Science in as a science) is that there is a right answer, or at least a small number of right answers or options. If you're sorting data and memory is tight, a heap sort is good; if the data overwhelms memory, then a hash-bucket key merge-sort is good. In highly parallel actions, believe it or not, a multi-threaded bubble sort may work best. There is an answer. This is when I realized:
The surveys that are being made are not generally mere curiosities of the analyst firms. The people who put these together have an idea of how the industry should proceed.
They analyze. Analysts are thinking when they first wake up, and right before bedtime, and sometimes even during the middle of the workday, about how companies would work better. It is a full-time job figuring out what is working and what will be working for people.
A story ...
Ten years ago, I was selling BPM solutions. I was the Systems Engineer. It was a relatively new technology. The V.P. of Sales was on the call. The CIO, the big cheese for the company's technology direction, made a comment. "Nothing like BPM has ever been necessary to run our business." The V.P. replied, "That's because you're not innovative."
Long, pregnant pause. Not quite nine months, but very long. You could hear a pin drop. The Sales Manager who worked for the V.P. of Sales nearly drooled because his jaw dropped so far. All that work, and his boss had offended the customer, and it would be a long, quiet flight home. You could hear each person's heartbeat around the table. All the employees were looking at each other, pale and sweaty, not knowing what to do or say.
But the CIO took it in stride. "You're right. We're not innovative. In five years, everyone in this industry will be using BPM."
I'm pretty sure if this CIO had been surveyed, he would have been one of the lowest on the adoption curve. But, when presented with the reality of what can be done, he recognized that he needed to move.
Virtualization
If you're not thinking about virtualization, why aren't you? Why do you want to run your business's server farm with 25% to 75% more hardware than necessary? There may be good reasons, but understand them.
If you're a maker of software products, why are you not enabling your customers to get into a virtualized environment? Will the customer appreciate this decision? Your largest customers will virtualize soon. Do you want them to move to a competitor's product that offers enhanced virtualization?
Supporting virtualization is in most cases a testing step. BEA provides additional virtualization capabilities and performance with WLS-VE (WebLogic Server - Virtual Edition) and other VE-enabled products. Even if you're not working with WLS-VE, you should test in a virtualized environment.
SaaS
The Software as a Service trend is picking up steam. It is being driven from two primary directions: the front-end and the back-end. On the front-end, AJAX and smart client web applications are getting very good. Only a minority of computer users install their own mail readers anymore. In the business world, especially if you consider BlackBerry devices and technologies like Microsoft Outlook Web Access, email is virtualized. Thin UIs are sufficiently good.
On the back-end, virtualization and consolidated hosting are making SaaS a reasonable paradigm for organizing software. You can adjust the resources on a per-application basis. With a proper billing system, you can apply your best technical resources to maximize business value with configuration options. Add a billing system, ad revenue, usage models and fees, and you have a new venue.
If I ran a company, I would want all business services to be provided through SaaS. Why? Disaster recovery, ramping and scaling, field offices, moving expenses, and overall flexibility. In an office building in Newton, Mass several years ago, a small company was "wiped out" because the tenant on the floor above left the window open. The heating pipes froze and when the heating turned on, water gushed down to the company below. Operations crawled for a few weeks as they worked through insurance, bought new computers, on-and-on. Another CEO once told me that a move was one-third as expensive as a fire. His company had had three of each.
The odds of a disaster occurring for some or all of the operations of a company seem to be about 25% over the course of a decade. Factor in severe storms that do not actually cause a "disaster" to you but threaten a disaster, and it's closer to 100%.
Version 1.0 2008-02-20 2:02PM
Between iPhone and Air ! ...
Posted by fmikkels on January 25, 2008 at 12:00 AM | Permalink
| Comments (2)
The iPhone, the iPod, the Apple Air ... convergence from three sides to a common place?
Apple Air ... Where else have we heard "Air" recently? From Adobe. Their line of software to allow high-quality thin-client (e-hem) applications. In July 2007, I commented on the Apple iPhone as a next generation personal computation and communication device. Now, the Apple Air is making the reverse convergent evolution.
What's in a Name? ... remember McIntosh Labs? Now remember Air?
Few may remember McIntosh stereos. Do you? McIntosh was a hardware manufacturer of audio stereo equipment in the 1970s. A good portion of its income may have come from licensing the name "Macintosh" to Apple Computer, which, years ago, Apple still gave credit for. Now Apple makes audio equipment in the form of the iPod family.
Where else have we heard "Air" recently? (hmmm) From Adobe. Their line of software to allow high-quality thin-client (e-hem) applications. Could the culture of Apple be to drop 10-year breadcrumbs? Daughter "Lisa", computer "Lisa". McIntosh stereos to Macintosh music players. Adobe Air to Apple Air to ...
I see evidence that the convergent track is continuing. Somewhere between the iPhone and Air is the future. An all-in-one, thin client personal computation, communication, and entertainment device.
The iPhone
I don't have to tell you what the iPhone is--but let's look at the ads. The early television ad in May 2007 ran something like this: "Not the mobile internet, not a version of the internet, not something that's just like the internet. It's the Internet on your phone." As the deployment date came closer in July 2007, the ads focused on the coolness of the device as a consumer of video, music downloads, email, and again lastly, "as a phone." The impulse buyer was probably swayed by the message "you were the last to get the cool iPod; don't make that mistake again!". A campaign that sold 500,000 units in less than a week has to be considered exceptionally well executed.
Being more technically focused, I'm focusing on the earlier message about the Internet. The message left out that the iPhone is an OS-X based computer with a 3.5" color touch panel and WiFi. It's a Mac in iPod's clothing. Add a little enterprise software, perhaps a SIP server and apply some proven algorithms. Wow!
Here's what I was thinking about the iPhone in July, 2007:
All internet enabled applications will be accessible through the iPhone. It's the internet, after all. If you've standardized on Web 2.0 in your enterprise, then iPhone can be your mobile office. Add to this the capabilities of back-office imaging software. The 2 mega-pixel resolution is not a serious limitation because software can uprez two or three images to aggregate a virtual 4 megapixel image. Now you have faxing, OCR, and all single sheet document processing capabilities. An iPhone could be a platform for a point-of-sale device.
Now, the Air ...Convergent Evolution
In January 2008, the Apple Air was announced. It takes the Mac and diminishes its size to the thickness of the iPhone. (hmmm) The overall dimensions of the width and height are similar to a regular Mac laptop, but there is evolution.
It's tricky being a hardware manufacturer evolving towards a vision. You make money by selling hardware that is continually replaced by the next great thing. Small devices fetch smaller money than big devices. Manufacturing costs are highly related to the number of components and the weight of those components. Margins are higher on small devices. People will pay $500 for something the size of an iPhone, but will pay $2500 for a laptop. And, if you can get them to buy both, you get $3200 (including $200 for connective cables and equipment).
Where a small business person once bought an individual fax machine ($300), a copier ($500), a scanner ($200), and a printer ($150), for a total of $1150, you will have just one mobile device for your pocket for the cost of a phone. And, just as the printer vendors make money off the ink cartridges, the unified pocket device will make money off phone service, content downloads, and applications.
Hmmm ... "Air" .... where have I heard that name before?
Rev 3 : 2008-03-26 00:00:03
Rev 2 : 2008-01-25 00:00:02
Rev 1 : 2007-07-21 04:34:19
Web 2.0, SaaS and Portals
Posted by fmikkels on January 19, 2008 at 7:18 PM | Permalink
| Comments (1)
SOA, Portlet Servers, Web 2.0, AJAX, SaaS, and Social Computing, represent different views of a complete re-examination of application delivery. The sum represents the general vision of:
"With the least amount of investment, I want to maximize existing capabilities."
SOA, the Services Oriented Architecture, applies modular programming techniques to the back-end of enterprise application architectures. Early adopters specifically want to reuse existing application logic--leveraging what is already there.
Services like credit verification, order entry, catalog query, provisioning, and job status can be collected in a repository and used as needed. The web service infrastructure is specifically designed to allow directory access, authentication, and governance. Control, security, statistics, and planning can be handled at the component level.
Portal and Portlet Servers provide a means, through techniques such as WSRP, to put functional front-ends on the SOA-enabled services. If you are presenting a means to interact with a service, you will want a collection of user interfaces ready to go. Having portlets for SOA-enabled resources assures quicker adoption of those resources (quicker ROI).
Web 2.0 describes an environment in which significant applications are delivered through the internet, and AJAX is a means to organize software client code in a web browser. Using readily available JavaScript instead of more deployment-sensitive Java applets or Flash helps assure that the universal deployment provided by the web is maintained.
One means to organize code in Web 2.0 is in a Mash-up. This is where two different services are brought together to provide higher visibility.
House prices + maps = A new Real Estate application.
Song popularity ratings + song download = on-line radio station.
If you have a domain expertise, you can focus on making a high-powered domain-specific set of services and portlets, and use mashups and web authoring tools as your means of application deployment.
SaaS (Software as a Service) describes the economic model where you can make money selling browser-enabled services over the internet. Where SOA describes the technical organization of services, SaaS describes the outward-facing presence. Facing the public, SaaS requires a greater level of governance and control. The directory and governance services to manage SOA will be expanded to assist with fraud protection, virus detection, and billing.
These techniques, SOA, Portlets, Web 2.0, AJAX, and SaaS encompass the means to deliver any enterprise-quality application over the internet. The motivations to do so involve disaster recovery, simplified development, and much easier administration. In short, lower risk and cost.
Social Computing removes an additional layer from the overall application development cost, and provides a means to leverage your domain expertise without needing to "reinvent the wheel" in each new application.
All organizations have an existing collection of knowledge: procedures and processes to complete a task or escalate a case, knowledge of web sites to perform shipping or provide maps, and pointers to other applications that are related to the application you're in.
Social Computing provides the means to capture the organizational knowledge and make it part of your application. Through usage, the most relevant, most useful tools as selected by the people who are most like you become available.
By using the patterns of the user's applications, you can maintain and improve services, identify new mashup opportunities, and improve the overall application experience. ISVs can adapt quickly to their customers' changing requirements. Enterprises can maximize their application investment and leverage their employees' knowledge.
"How do I achieve this migration?"
Your organization is more ready for SOA modularization than you probably realize. Using Web Service abstractions, you can front-end your database and applications to expose the functionality you desire. For the green-screen applications, this type of interfacing was called "Screen Scraping". As bemoaned as this approach was, it was shown to not only work, but to also perform reasonably well. It was branded non-elegant, but it worked, especially when you consider that it was the integration of minimal intensity. All technologies since the 1970s have been far more flexible than screen scraping. Socket protocols, middleware, database APIs, Java RMI, HTML/HTTP, RPC, Web Services, ... these are all better technologies than screen scraping.
How to Proceed towards Web 2.0 and SaaS
Regardless of the way you get your services exposed, do so. "Screen scraping", the most brute-force of service presentation techniques, seemed doomed to fail, but it has proven to be good enough. Since the 1980s, integration technologies have been developed to make integration even better. Whatever integration solution you choose is likely to be good enough to get started.
There are two primary interfaces to make: One is a programmatic web services interface, and the other is a portlet interface. Try to reflect the web services interface through the portlet interface. The portlet will be adopted quicker and this will enforce testability.
And employ a governance solution for development time governance and runtime governance so you can accurately track costs, interactions, and user patterns. From this beachhead of tangible knowledge of your integration usage, better plans can be made in preparation for future projects.
rev 1.1 - 1/20/2008
ALER brings REAL value
Posted by fmikkels on January 17, 2008 at 1:17 PM | Permalink
| Comments (0)
AquaLogic Enterprise Repository (bundled today with AquaLogic Service Registry in the AquaLogic Registry and Repository) provides a level for business semantic content, and can build upon the source code repository you use today (CVS, SVN, ... ). But rather than discuss the features of ALER, let's look at problem domains that people are trying to solve.
UML - Program modeling in UML and other tools starts with diagrams, and evolves to software. How do you track the ROI of UML modeling? How do you assure the process of updating UML diagrams as the software changes? ALER can help with those issues.
ISO-9000 - To be compliant, you need to be able to produce documentation and software lineage to substantiate that the testing and process have been followed properly. Asset values on testing, documentation, training coverage, inspection, and other ISO-artifacts can be related to your software. ALER can help you do that and to assure that your ISO-9000 compliance tools are being applied as they should.
Financial Due Diligence - we've all heard the story. "We have 3 man millennia invested in our product. When we IPO, it will be impossible for the competition to catch up." How could you substantiate that? How could you follow the software development ROI more closely to get figures on not only how much effort is put into a software project, but the effective effort and saved effort? How could you substantiate your off-shoring strategy's success? Code per unit cost? What is the Net Asset Value of your software? Resulting bugs per product? Returns per unit sale? ALER can be configured to work with your metrics to determine the asset value of your software as it pertains to your business.
508 mobility compliance - when working with the Federal Government, it is important to be able to provide handicapped accessibility to your software products. How do you coordinate and estimate the cost of, and scheduling of, mobility compliance? For example, if you use old DOJO widgets in your software, that may not be compliant. However, a new UI toolkit can be selected to be compliant. What is the impact of changing that? How would you coordinate the visibility into your project to accomplish that? ALER's flexible schema allows you to track time, cost, trouble tickets, earnings, compliance, customer satisfaction figures, and any other enumerable values.
Total Quality Management (TQM) -- in total quality management, you will want to be able to investigate the error rate by component software contribution and overall test coverage. By integrating your automated tests and testing strategy into ALER, you can produce lifecycle charts of how software is proceeding towards acceptance levels.
ALER is a very exciting product because it can produce value for development, finance, professional services, and support. It is the knowledge base of how your software projects are doing.
Unlike many other enterprise software products like Struts or JSF, ALER does not need to be designed into the base of the project. In fact, some of the highest ROI is seen by customers who have an existing product deployed in numerous places, and it needs to be changed. COBOL -to- Java. Stand-alone -to- hosted. Locally-clustered -to- Geographic failover.
Queriable Binary XML Encoding
Posted by fmikkels on January 14, 2008 at 11:06 AM | Permalink
| Comments (0)
Format Discussion
The format for binary XML is an extension of UTF-8 encoding. Starting with how UTF-8 is encoded, additional shadow formats are inserted to allow encoding primarily of integers and XML tag structure.
The UTF-8-OB formats are integer formats. In the UTF-8 standard encoding, continuation bytes are prefixed with the bits "10", and by reading backwards in the stream up to six bytes, usually fewer, you can find the initial byte, which is prefixed by 11...0...: n introductory 1 bits, one 0 bit, and the first segment of data content. The number of introductory 1 bits indicates the overall length of the UTF-8 encoding. If a byte beginning with "10" is discovered intra-stream, the limits of the encoding can be interpreted.
If a byte beginning with "0" is found, then by reading backwards in the stream up to seven bytes, much like the previous case, it can be determined whether it is a UTF-8 1-byte character or a final byte of the UTF-8-OB format.
Reading from the beginning, or dropping into the middle of a buffered stream allows for the lexical-reorientation of the position in the stream. This may be valuable in a multi-threaded environment that is reading a minimally-hierarchical document in parallel, for example, client identification records.
UTF-8-OB can extend to any length to support any positive integer. Roughly speaking, two decimal characters can be replaced by one byte of UTF-8-OB encoding. With the format, a number (or any value) that is represented in multiple attribute or child values can be further reduced by creating an overlay mapping.
Individual UTF-8 Character and Extensions
The standard UTF-8 format describes how characters are
encoded. These are noted in the rows that display the form "UTF-8".
| Bytes | Codes | Form | Bits | Digits |
| 1 | 0zzzzzzz | UTF-8 | 7 | 2.11 |
| 2 | 110yyyyy 10zzzzzz | UTF-8 | 11 | 3.31 |
| 2 | 10aaaaaa 0bbbbbbb | UTF-8-OB | 13 | 4.02 |
| 3 | 1110xxxx 10yyyyyy 10zzzzzz | UTF-8 | 16 | 4.88 |
| 3 | 10aaaaaa 0bbbbbbb 0ccccccc | UTF-8-OB | 20 | 6.05 |
| 4 | 11110www 10xxxxxx 10yyyyyy 10zzzzzz | UTF-8 | 21 | 6.51 |
| 4 | 10aaaaaa 0bbbbbbb 0ccccccc 0ddddddd | UTF-8-OB | 27 | 8.14 |
| 5 | 111110uu 10wwwwww 10xxxxxx 10yyyyyy 10zzzzzz | UTF-8-EX | 26 | 8.31 |
| 5 | 10aaaaaa 0bbbbbbb 0ccccccc 0ddddddd 0eeeeeee | UTF-8-OB | 34 | 10.24 |
| 6 | 1111110t 10uuuuuu 10wwwwww 10xxxxxx 10yyyyyy 10zzzzzz | UTF-8-EX | 31 | 10.29 |
| 6 | 10aaaaaa ... 0fffffff | UTF-8-OB | 41 | 12.35 |
| 7 | 10aaaaaa ... 0fffffff | UTF-8-OB | 48 | 14.45 |
| 8 | 10aaaaaa ... 0fffffff | UTF-8-OB | 55 | 16.56 |
| 9 | 10aaaaaa ... 0fffffff | UTF-8-OB | 62 | 18.67 |
| 10 | 10aaaaaa ... 0fffffff | UTF-8-OB | 69 | 20.77 |
| 11 | 10aaaaaa ... 0fffffff | UTF-8-OB | 76 | 22.88 |
| 12 | ... UTF-8-OB can extend to any length | | | |
| 1 | 11111111 (FF) | UTF-END | -- | -- |
Encodings that extend UTF-8 in ways hinted at by the format are labeled "UTF-8-EX", UTF-8 extensions. These show five- and six-byte representations. These are by default ways to encode integers up to 10 decimal digits long.
UTF-8-OB encodings show how the out-of-band encodings can occur. These formats allow for non-character encodings of arbitrarily long form. The minimum length of these forms is two bytes.
UTF-END, coded by 11111111 (FF), is special in the above encoding. The byte 0xFF is never used for anything other than UTF-END, and it cannot occur within any other encoding.
The complete list of encodings above is called UTF-8-XX. All encodings except the UTF-END code together form the set of the UTF-8-FF characters, or x-FF for short.
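To make the UTF-8-OB rows of the table concrete, here is a minimal sketch of an integer encoder and decoder. Several things in it are assumptions rather than part of the format description: the class and method names are hypothetical, the most significant bits are assumed to go in the leading 10aaaaaa byte, the encoder always picks the shortest form of at least two bytes, and the decoder is handed exactly the bytes of one encoded value (how the end of a value is detected in a stream is not covered here).

```java
public final class Utf8ObSketch {
    /** Encodes a non-negative value as 10aaaaaa followed by one or more 0bbbbbbb bytes. */
    public static byte[] encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("only non-negative integers are supported");
        }
        int length = 2;          // minimum length of a UTF-8-OB form
        int capacityBits = 13;   // 6 bits in the first byte + 7 bits in the second
        while (capacityBits < 63 && (value >>> capacityBits) != 0) {
            length++;
            capacityBits += 7;   // each extra byte carries 7 more bits
        }
        byte[] out = new byte[length];
        long rest = value;
        for (int i = length - 1; i >= 1; i--) {
            out[i] = (byte) (rest & 0x7F);       // 0bbbbbbb
            rest >>>= 7;
        }
        out[0] = (byte) (0x80 | (rest & 0x3F));  // 10aaaaaa
        return out;
    }

    /** Decodes one complete UTF-8-OB value of at most ten bytes. */
    public static long decode(byte[] bytes) {
        if (bytes.length < 2 || bytes.length > 10 || (bytes[0] & 0xC0) != 0x80) {
            throw new IllegalArgumentException("not a UTF-8-OB value");
        }
        long value = bytes[0] & 0x3F;
        for (int i = 1; i < bytes.length; i++) {
            if ((bytes[i] & 0x80) != 0) {
                throw new IllegalArgumentException("trailing bytes must start with a 0 bit");
            }
            value = (value << 7) | (bytes[i] & 0x7F);
        }
        return value;
    }
}
```

Under these assumptions a two-byte form carries 13 bits and a three-byte form 20 bits, matching the Bits column above. The sketch stops at ten bytes only because it uses a Java long; the format itself places no such limit.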
Overall Document Format
| Part | tag(format) ... | Arg(format) ... |
| VERSION | (version) | (no tag) id(x-FF) |
| PROMOTE | section | section(x-FF) // this defines <section> |
| | char(x-8), len(x-OB) | |
| | char(x-8), len(x-OB) | |
| | char(x-8), len(x-OB) | |
| METATAG | section | <section> // now in definitions |
| FORMAT | format | <fmt> format(x-FF) value(x-FF) |
| SPACES | switch-spaces | <sws> |
| | relocate | <rel> |
| | tag | <tag> name(x-FF) |
| | tag-ns | <tns> name(x-FF) namespace(x-FF) |
| | xmlns | <xns> name(x-FF) namespace(x-FF) |
| | attribute | <atr> name(x-FF) value(x-FF) |
| | attribute-multi | <atm> name(x-FF) value[x-FF]<FF> |
| | attribute-multi-spaces | <ats> name(x-FF) value[x-FF]<FF> |
| | attribute-ns | <atn> name(x-FF) ns(x-FF) value[x-FF]<FF> |
| | attribute-ns-multi | <anm> name(x-FF) ns(x-FF) value[x-FF]<FF> |
| | attribute-ns-multi-spaces | <ans> name(x-FF) ns(x-FF) value[x-FF]<FF> |
| | child | <chd> value(x-FF) |
| | child-multi | <chm> value[x-FF]<FF> |
| | child-multi-spaces | <cms> value[x-FF]<FF> |
| | comment | <kom> value(x-FF) |
| | comment-multi | <kmm> value[x-FF]<FF> |
| | comment-multi-spaces | <kms> value[x-FF]<FF> |
| LEXICAL | section | <section> // now in lexical compression |
| | <rel> n | |
| | <c><c><c> ... <FF> | |
| | <c><sws><n><c><sws><FF> | |
| | <rel><n> | |
| | char int char int <FF> | |
| | <fmt><"fmt"><n><FF> | |
| | char char ... <FF> | |
| BODY | section | XINT |
| | ... nested content ... | |
Polymorphism, Java : Sets and Maps ... any ideas?
Posted by fmikkels on December 30, 2007 at 8:42 PM | Permalink
| Comments (3)
Background ... Over the holidays, I have been working on a binary representation for XML that is queryable and compressed. This doesn't particularly clarify the set-up, but it's a start !!
In the organization of the XML, text, meta tags, attributes, and all the other parts of XML would be represented by a unique number. Text that is repeated would have a smaller number. Meta tags could provide mapping for concatenated values, and each of the constituent parts would be mapped to a unique integer. Integers are mapped to integers automatically, so the mapping for these would not need to be provided. However, if a 12-digit number occurred 100 times in one XML document, it would compress better to map a 2-digit number to it.
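The idea that repeated text gets a smaller number can be sketched as a frequency-ranked dictionary. This is an illustration only: the class FrequencyIdsSketch, the method assignIds, and the choice to rank purely by occurrence count (ties broken by first appearance) are assumptions, not the mapping scheme actually used in the format.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class FrequencyIdsSketch {
    /**
     * Assigns each distinct token an integer id, giving the smallest ids
     * to the tokens that occur most often. Ties keep first-seen order.
     */
    public static Map<String, Integer> assignIds(List<String> tokens) {
        // Count occurrences, remembering the order of first appearance.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        // Sort by descending count; List.sort is stable, so ties keep their order.
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        Map<String, Integer> ids = new LinkedHashMap<>();
        int nextId = 0;
        for (Map.Entry<String, Integer> entry : ranked) {
            ids.put(entry.getKey(), nextId++);
        }
        return ids;
    }
}
```

With ids handed out this way, a long value that appears a hundred times costs its full length once in the dictionary and a short id everywhere else.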
Numbers would ideally map to numbers, and Strings would ideally map to their hash values, and in most cases, the specifics of what is mapped can be worked out by these simple rules. For the data gathering phase, I envisioned using Set operations to note that a value was seen and how it was mapped. For this I would like my class to implement the java.util.Set interface. Then, the caller of the class does not need to determine what the value will map to and can perform a set "add()" operation.
When the value is dropped into my java.util.Set-enabled class, there is a small chance that the ideal form cannot be selected. When rendering the value later, I would need to determine what it was really mapped to. For that, I would also like my class to implement the java.util.Map interface.
I had never used this aspect of the API, and it hadn't occurred to me, but the "remove" method defined on these two interfaces has a different signature. java.util.Set.remove(Object) returns "true" if the element was in the set, and java.util.Map.remove(Object) returns the value the Object was mapped to, or null otherwise. Because Java does not allow inherited methods to differ only in their return types, I could not make a class that behaves both as a Map and as a Set, because of that one signature.
The approach I'm taking is to construct a class "SetMap" that implements the union of all methods in the Set and Map interfaces, choosing the favored form for remove() to be the Map signature. Then, I created two other classes, "SetMapAsSet" and "SetMapAsMap" to provide an identity mask to allow the data to be viewed with the proper polymorphic prism.
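A stripped-down sketch of that arrangement might look like the following. It is not the actual SetMap code: the names SetMapSketch, asSet, and asMap are hypothetical, the "ideal" mapping is assumed to be simply the key's hashCode(), and the class does not implement the full union of both interfaces. What it does show is how one backing store can be handed out through a Set-shaped view and a Map-shaped view so that each caller sees the remove() signature it expects.

```java
import java.util.AbstractSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

public final class SetMapSketch {
    /** One backing store, viewable as a Set of keys or as a Map of key to mapped integer. */
    public static final class SetMap<K> {
        private final Map<K, Integer> mapping = new HashMap<>();

        /** Set-style add: the caller does not choose the mapped value. */
        public boolean add(K key) {
            if (mapping.containsKey(key)) {
                return false;
            }
            mapping.put(key, key.hashCode()); // assumption: the hash is the "ideal form"
            return true;
        }

        /** Map-style remove: returns the mapped value, or null if absent. */
        public Integer remove(Object key) {
            return mapping.remove(key);
        }

        /** View with java.util.Set semantics, including boolean remove(Object). */
        public Set<K> asSet() {
            return new AbstractSet<K>() {
                @Override public Iterator<K> iterator() { return mapping.keySet().iterator(); }
                @Override public int size() { return mapping.size(); }
                @Override public boolean contains(Object o) { return mapping.containsKey(o); }
                @Override public boolean add(K key) { return SetMap.this.add(key); }
                @Override public boolean remove(Object o) { return mapping.remove(o) != null; }
            };
        }

        /** Read-only view with java.util.Map semantics. */
        public Map<K, Integer> asMap() {
            return Collections.unmodifiableMap(mapping);
        }
    }
}
```

The views are adapters rather than subclasses, which sidesteps the return-type clash: neither view has to satisfy both interfaces at once.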
Ultimately ... I may just use the Map signature and require the user of the class to know during the tag and data gathering phase that they need to map to values explicitly. However, it did bring an interesting thought experiment to bear about how to implement polymorphism in situations where things line up logically (like an Airplane that is also an Asset) but not exactly at the technical level of the signature.
Addendum A:
Wikipedia's reference to the Liskov substitution principle is an interesting discussion point. I'm not sure how it applies to multiple inheritance, or to multiple interface inheritance as supported in Java, in general where you're explicitly desiring to present two object personalities. Having stepped through every method of each interface, everything was aligned with my unified definition except "remove()". Which then got me wondering, if you were in a situation where you had to present two different interfaces, and a minor technical conflict on a return type (that you never used) was the only thing preventing it, what would you do?
rev 1.1
Artificial Intelligence! Software's Corvair and Cold Fusion all in one.
Posted by fmikkels on December 8, 2007 at 10:16 PM | Permalink
| Comments (0)
Science pretends to be open-minded and look beyond the Public Relations and the Hype to the essence, but it doesn't. I recall the story of John Newlands' discovery in the mid-nineteenth century that elements followed the rules of music, and elements, just like notes, have similar tonal er, um, chemical properties every eight notes, um, ... elements.
Laughed at? You bet. The scientists tore him apart so badly that when Dmitri Mendeleev codified the Periodic Table of the Elements, it is said that particular steps had to be made to assure any similarities to that (ha! ha! ha!) Theory of Octaves wouldn't be too obvious.
And, having read this story of Nerd History from long ago, I wondered if this episode set up schools of scientific thought that later would have difficulty accepting spectral analysis, de Broglie's theory, relativity, and string theory. Each of these subsequent theories delves into the wave-like or harmonic properties of matter.
In software, we have the history of Artificial Intelligence. I must disclose right now that my pre-roots are in A.I. My first programming language was LISP. My older brother, Carl, weaned me onto software when I was 10 or 11, and to keep me busy when I would visit him at Project MAC, he'd put me up at the card punch machine to write programs, and occasionally, he'd stop hacking (it was a good term then) long enough to run some cards for me and let me debug more.
Along the way, I came to know the giants--Stallman and Greenblatt--though they may not remember me. But A.I. was where I started my trek down computer lane. Quickly, though, I noticed that "A.I." and "Rocket Science" were somehow bad things in the industry. "A.I. was hard", and "A.I. 'failed' because there wasn't a killer application that used it."
With programming paradigms based on registries and configuration files, and multi-versioned complex packages, things will proceed more slowly than they could if the industry could re-capture the promise that was A.I. Like the "Theory of Octaves", I know it won't be called "A.I."
I think it will be called Governance.
Governance is the means by which application composition, behavior, and operation are modified by the "environment" in which it is running. As the environment becomes more complex, users will find themselves standing in the Library of Congress with no card catalog staring down stacks and stacks of components and decisions and not knowing which one to take.
Users will want suggestions. Which service provider should I hook up with? Should it be deployed on a VM? or on a remote server? I see A.I. as being key for allowing the Web 2.0 notion of an Ad-Hoc application to be developed without the programmer knowing the nuance of every component API level and component configuration file.
Linear Programming! Index of Activity!
Posted by fmikkels on November 7, 2007 at 2:15 PM | Permalink
| Comments (0)
I'm looking over my STATS to pick up any information that may be valuable in tuning my blog for my market -- enterprise application developers. Here are the hit rates on my blogs so far:
To make sense of this data, I employed linear programming and solved simultaneous linear equations. Accounting for the number of hits over the time an entry has been published, I find a correlation between hits and the punctuation in the title: a hyphen (-) has negative value, an exclamation point (!) gets the most attention, followed by a question mark (?), and finally an apostrophe has the value of a letter.
This is a silly case of applying these algebraic techniques, but real-world examples of applying linear programming occur all the time.
I remember in high school learning linear programming where you would solve for variables and unknowns based on complex information that is given to you. Statisticians would use it to calculate the prediction of income of a household based on the number of appliances and chairs around the dinner table and things like that.
A more common form would allow you to make decisions based on the ROI and cost of different commodities.
Example: mixing spices in a factory to maximize yield
You want to mix the most profitable batch of a spice, and you have a few product sources to choose from. The law requires that a spice be 97% free from weeds. The law prohibits you from purposely mixing contaminants into a spice, but you can mix batches of varying purity provided no sticks and stems and weeds are added directly.
You have a source of spice that is 99.5% pure for $1000 a ton, and another that is 93% pure in the same spice for $550 per ton, and you have a previous batch that is 98% pure and not selling in the warehouse, but it would cost $100 per 1000-pound shipment to transport back to the mixing plant. You must also add a mixture of two preservatives, up to 1% of volume, that cost $30/pound and $50/pound, provided the first is not used in a proportion less than 2:1 to the second. If your spice is 99% pure or more, you can use as little as 0.25% preservative. However, if it is less than 99% pure, you must use 0.5% preservative or more. Given these restrictions, what is the lowest cost per pound?
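As a minimal sketch of how the purity part of this problem can be set up, the code below blends sources to reach the 97% floor at the lowest cost per ton. It is a deliberately reduced version, not a full solution: it ignores the preservative rules, assumes a ton is 2000 pounds (so the $100 per 1000-pound transport charge becomes $200 per ton), and assumes the warehoused batch has no cost beyond transport and no limit on quantity. With a single purity constraint, an optimal blend never needs more than two sources, so the function simply checks every single source and every two-source blend that lands exactly on the target. The function name and data layout are mine, for illustration only.

```python
from itertools import combinations


def cheapest_blend(sources, target):
    """Return (cost_per_ton, {name: fraction}) for the cheapest blend
    whose purity is at least `target`.

    `sources` is a list of (name, purity, cost_per_ton) tuples.
    """
    candidates = []

    # A single source is feasible on its own if it already meets the target.
    for name, purity, cost in sources:
        if purity >= target:
            candidates.append((cost, {name: 1.0}))

    # Otherwise pair a source above the target with one below it and mix
    # them so the blend sits exactly on the target purity.
    for (n1, p1, c1), (n2, p2, c2) in combinations(sources, 2):
        if (p1 - target) * (p2 - target) < 0:
            x = (target - p2) / (p1 - p2)  # fraction of the first source
            candidates.append((x * c1 + (1 - x) * c2, {n1: x, n2: 1 - x}))

    if not candidates:
        raise ValueError("no blend can reach the target purity")
    return min(candidates, key=lambda item: item[0])


# Costs are per ton; the warehouse figure is the assumed transport cost only.
purchased = [("pure", 0.995, 1000.0), ("cheap", 0.93, 550.0)]
warehouse = ("warehouse", 0.98, 200.0)

print(cheapest_blend(purchased, 0.97))
print(cheapest_blend(purchased + [warehouse], 0.97))
```

Under those assumptions the two purchased sources blend to roughly $827 per ton (about 61.5% of the purer source), and adding the warehoused batch makes it the whole answer at $200 per ton. That suggests the real problem only becomes interesting once stock limits and the preservative rules are added as further constraints.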
I remember in High School, circa TRS-80 days, that solving this type of problem was what computers were expected to be able to do. This was decision support. Where has this gone?
I've been searching around the internet looking for good implementations of linear programming and solving linear equations, and haven't found any. I find it interesting that software has not been developed to solve these problems the way we were taught to solve them in high school.
Two questions (or three). Do you know of a good linear programming package? Would you use a package like this?
Did you come to this blog entry because the title had extra punctuation?
version: 1.1
BEA Guardian Rocks!
Posted by fmikkels on November 2, 2007 at 1:31 PM | Permalink
| Comments (0)
BEA Guardian is not magic, but it replaces magic. Sometimes, there's a rare IT professional that has that sixth sense that A and B are not in alignment, and knows to fix it before there is a problem. Usually that IT magician doesn't exist. But BEA Guardian does exist.
One bit of exciting news from BEA World in Beijing ("Peking" to many Europeans) is that BEA Guardian is now free to most BEA customers. This means that most BEA customers can use Guardian to solve issues before they occur and respond more quickly if a problem does occur.
In IT, there are three tasks to be avoided. Guardian helps with all of them.
Task 1: Combing through release notes and configurations to solve a problem: Your application is complex and contains products from numerous vendors and open source communities. Some versions work better together than others. Guardian identifies configurations that do not work together well.
Task 2: Submitting a support ticket: When you experience a problem or have a question, you may want to contact BEA Support. The first thing support needs is usually a complete inventory of the software you're using, the configuration files, and log files. From Guardian, you can submit a ticket to BEA Support, and the pro forma information will be automatically submitted for you. If you're bumble-fingered like me, this could save you ten minutes on each ticket.
Task 3: Responding to a crisis: Your application is complex, and there are many resources that need to be allocated between application servers, web servers, database systems, open-source frameworks, and your application. Guardian can be used to periodically check your system for the signs of problems. Sometimes, a situation can creep up over an extended period (e.g. a JDBC memory leak) and sometimes, a fix may have been made before you've even encountered the issue. Guardian handles both those types of cases. Consider this. If there were a way to avoid one unplanned outage this year, what would that mean to you?
Guardian is an extensible framework of pattern matching. BEA Support maintains and provides THOUSANDS of pattern signatures for you to use, and you can make your own. It hooks into WebLogic Server and other products using various technologies. (You can read about those).
Look up Guardian at the Guardian Homepage. There is also a whitepaper with a nice Q & A at the end. The brochure is also an interesting place to start.
rev 2.0
Rethinking Algorithms
Posted by fmikkels on October 1, 2007 at 9:18 AM | Permalink
| Comments (0)
I have been working on my programming language, "Just", the Java Smalltalk compiler. I wrote it originally in 2002 to allow Smalltalk-like expressions of blocks and iterators. Over time, I relaxed the full Smalltalk-80-type specification and went to a "best-fit" Java Reflection API approach to execute these instructions.
For ad-hoc, and algorithmic-oriented work, I find the Smalltalk-like specifications to be more flexible primarily because all data items are objects and all collections inherit common collections frameworks. For example:
collection do: [ :x | System out println: x ]
prints out all of the contents of a collection separated with carriage returns. That is far easier than constructing a for- or while-loop on an enumeration (if it's an enumerated list), or an iterator (if it's iterator oriented), or "[x]" if it's an array, or ....
For the compiler, I was working with standard compiler techniques like LR(1) parsing and dissecting things into terms, expressions, factors, etc. as I had been taught in compiler class. I found I was creating dozens of classes with abandon for all the data forms that would be parsed. I started restructuring the class hierarchy to simplify -- merge things that were similar. I began re-thinking the reason for those algorithms.
In the 1950s, when Donald Knuth, Grace Hopper and that generation of computer legends were working, 8KB of memory was a large amount. CPU time was expensive. And disk I/O (or tape or punch-card I/O) was very slow. By contrast, I was reasonably sure that it would be rare for me to have one source block to compile greater than 5 MB, and all of that could fit into memory.
I went back to the previous generation of computer heroes, Alan Turing, and the Turing Machine for ideas of a compiler. Alan Turing had a notion of the computational tape and moving back-and-forth using a state machine and substituting tokens on the tape for other tokens. I could try that.
I then dissected the "tape" of code using simple operations applied in a logical order of succession.
Comments are removed first
Strings and character constants are removed. (Things in strings can look like code macroscopically but not really be code.)
Using an iterative approach, nested structures are removed with the rule that the innermost pair will be processed first: a) groups in parens, b) groups in brackets, c) groups in braces. Each of these will be further pushed into for processing.
As each item is identified, it is replaced with a token: a sequence of characters that is outside the range of standard processing. In my case, I chose a format of a digit, two letters, and digits to identify the type of tag removed, e.g. 0pa123 for parenthesized term 123.
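The steps above can be sketched roughly as follows. This is my own illustration in Python, not the code from Just: it assumes Java-style comments, and apart from the 0pa prefix mentioned above, the tag names (0st, 0br, 0bc) and the running counter are hypothetical. Each removed piece goes into a lookup table so it can be processed later.

```python
import re

COMMENT = r"/\*.*?\*/|//[^\n]*"
STRING = r'"(?:\\.|[^"\\])*"'
CHAR = r"'(?:\\.|[^'\\])*'"
# A group is "innermost" when nothing between its delimiters is a delimiter.
INNERMOST = re.compile(r"\(([^()\[\]{}]*)\)|\[([^()\[\]{}]*)\]|\{([^()\[\]{}]*)\}")
GROUP_KINDS = ("pa", "br", "bc")  # parens, brackets, braces


def tokenize_tape(source):
    """Reduce source text to a flat tape plus a table of the removed pieces."""
    table = {}

    def stash(kind, text):
        token = f"0{kind}{len(table)}"  # digit, two letters, digits
        table[token] = text
        return token

    # 1. Comments are removed first. Caveat: in this order a string that
    #    contains "//" would be damaged; production code needs more care.
    tape = re.sub(COMMENT, "", source, flags=re.S)

    # 2. String and character constants become tokens.
    tape = re.sub(STRING + "|" + CHAR, lambda m: stash("st", m.group(0)), tape)

    # 3. Repeatedly replace innermost groups until no delimiters remain.
    def replace_group(match):
        index = match.lastindex  # which alternative matched: 1, 2, or 3
        return stash(GROUP_KINDS[index - 1], match.group(index))

    while True:
        tape, count = INNERMOST.subn(replace_group, tape)
        if count == 0:
            break
    return tape, table


tape, table = tokenize_tape('print(f(a[1], "x)y")) // done')
print(tape)
for token, text in table.items():
    print(token, "=", text)
```

Note how the parenthesis inside the string never confuses the grouping step, because the string was already replaced by a token before any groups were examined.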
More rigor would be needed to make this production code certainly, but as a debugging tool or a hacker's desk-top assistant, it is quite helpful.
The take-away that I am putting forward is that this algorithm is not 'efficient' in the classical sense, but it is very compact and fast. A two GHz processor can traverse a 100KB string in memory perhaps 2000 times per second. A code generator that generates code to parse code will be more compact and faster parsing across a String and replacing tokens than generating multiple Java classes to do push-down parsing. In a jar file, there is a minimal overhead for each class, and there is less overhead for a method. The types of problems that the 1950s compilers were solving are not as prevalent.
This is particularly true in the area of data transformation, in which one business record may be ten or twenty thousand bytes and, when fully decomposed, may become 200 objects or more. Turing-type access into data records for read and write may, overall, be a better use of resources because the data is more compact, and the page-jumping in memory will be lower because objects will not be traversed as frequently.
I have seen database internals take a similar approach. Within one disk page, several short database records may be stored. Because some data fields can vary in size, the records are truncated to required length, and written to the disk page. If any of the records on that page are accessed, then all of them on that page are read out, and the desired record is decompressed into the form expected by the schema. There was more to it than this to handle over-flows, but this was essentially the algorithm.
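A minimal sketch of that page layout, under heavy assumptions: this is not the internals of any particular database, and the schema, field widths, 2-byte length prefix, and 4096-byte page size are all invented for illustration. Each field is written at its actual length, and a record is padded back out to the schema's fixed widths only when it is read.

```python
import struct

SCHEMA = [("name", 20), ("city", 16)]  # hypothetical fields and fixed widths
PAGE_SIZE = 4096                       # assumed page size in bytes


def pack_page(records):
    """Write records back to back, each field as a length prefix plus data."""
    page = bytearray()
    for record in records:
        for field, width in SCHEMA:
            data = record[field].rstrip().encode("utf-8")
            if len(data) > width:
                raise ValueError(f"{field} is wider than the schema allows")
            page += struct.pack("<H", len(data)) + data
    if len(page) > PAGE_SIZE:
        raise ValueError("records overflow the page")  # overflow not handled
    return bytes(page)


def read_record(page, index):
    """Walk the page from the start and expand the record at `index`."""
    offset = 0
    for current in range(index + 1):
        record = {}
        for field, width in SCHEMA:
            (length,) = struct.unpack_from("<H", page, offset)
            offset += 2
            text = page[offset:offset + length].decode("utf-8")
            offset += length
            record[field] = text.ljust(width)  # pad to the schema's width
        if current == index:
            return record


page = pack_page([
    {"name": "Ada", "city": "London"},
    {"name": "Alan", "city": "Wilmslow"},
])
print(len(page), repr(read_record(page, 1)["name"]))
```

The two records occupy 29 bytes on the page instead of the 72 bytes their fixed-width form would need, at the cost of scanning from the start of the page to find a given record.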
CORBA took a similar approach for object references. Because references may be lengthy, if the same reference was passed multiple times, the second and third references would be a reference to the reference.
I was working in a Java applet environment that needed to access XML fields in JRE 1.1. Downloading packages of W3C XML parsers was not going to be acceptable, and really there were only a couple fields to check. Using similar techniques, I was able to traverse and extract the XML information, and construct and return XML information very compactly.
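For a sense of how little code this kind of point-purpose access needs, here is a sketch using plain string searches. It is not the applet code, and it only holds under assumptions that a real parser would not need: the element has no attributes, the same tag is not nested inside itself, and only the three basic entities are escaped. The function names are hypothetical.

```python
def xml_field(document, tag):
    """Return the text of the first <tag>...</tag> element, or None."""
    open_tag = "<" + tag + ">"
    close_tag = "</" + tag + ">"
    start = document.find(open_tag)
    if start < 0:
        return None
    start += len(open_tag)
    end = document.find(close_tag, start)
    if end < 0:
        return None
    text = document[start:end]
    # Undo the basic escapes; "&amp;" goes last so "&amp;lt;" stays "&lt;".
    return text.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&")


def xml_element(tag, text):
    """Build <tag>text</tag>, escaping the characters XML reserves."""
    escaped = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    return "<" + tag + ">" + escaped + "</" + tag + ">"


order = "<order><id>42</id><status>open</status></order>"
print(xml_field(order, "status"))
print(xml_element("note", "a < b & c"))
```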
Looking at the computing capacity of CPUs today, and the abundance of memory, I put forward the notion that constructing or using Java packages to handle simple point-purpose parsing may not be the best final-form in a generated system. As mash-ups become more complex and diverse environments become more widespread, the need for more packages, versioning those packages, and allocating runtime space for those packages may be heavier than the lexical parsing indicated. Lexical parsing may work better on strings than "traditional" object decomposition.
JSR 168 vs. WSRP
Posted by fmikkels on August 30, 2007 at 5:58 AM | Permalink
| Comments (0)
I was talking with Nick about updating his blog entry written on August 23, 2005. So many new advances have made this topic even more relevant today. Nick said, "go for it." With vast amounts of endorsed copying, here it goes:
JSR 168 is an API specification and WSRP is a messaging specification. Okay, so what does that mean? In general, an API specification is important when you want to build something and messaging is important when you want things to interoperate with each other. In this specific case, JSR 168 provides developers a way to write a portlet that can be hosted on any portal that supports this standard - this includes most Java-based portal products like WebLogic, WebSphere, Oracle, etc. This relationship is very similar to the one between a servlet and an application server. Note that this also means that a JSR 168 portlet will not run in a Microsoft Portal. WSRP, on the other hand, just specifies how the portlet content can be transferred in a message. It makes no assumption about how the portlet was actually built. To use it, you only have to understand the message and render it using whatever mechanism you have in your environment. Or if you are really smart, you can look at the message, parse it and understand it on the fly :)
This is similar to a web service; in fact, WSRP is a web service. So why would you use WSRP? For all the same reasons that you use a web service, the main one being interoperability. With WSRP, I can integrate a portlet running on Microsoft, BEA or some other proprietary portal server on my portal server. Just like you can take C++, EJB, CORBA, etc. and expose it as a web service, you can take a JSR 168 portlet, a portlet based on ASP or JSP, struts etc. and expose it as a WSRP portlet.
WebLogic Portal 10 (WLP) and AquaLogic User Interaction (ALUI) are both integrally connected with the JSR 168 and WSRP technologies.
Either can produce or consume portlets, and either can consume AJAX or HTML web sources. WSRP and JSR-168 are well supported at BEA from both the producer and consumer side. In the scheme of distributed portal application development, understand the control you have over your portlets and aggregated portal data:
WLP is the most-tightly integrated portal solution for the WebLogic Server and the WebLogic products such as WLI. WLP is the correct choice if you are raising portals or portlets from WebLogic Server.
ALUI is the most flexible portal solution allowing data to be sourced from virtually anywhere. If you are creating a community or a "dashboard" application that will source data (create portlets) from as-yet-unknown sources that may not be J2EE-, JSR 168-, or WSRP-compliant, ALUI is the one to choose.
Why Isn't Everything a Portal?
Posted by fmikkels on August 16, 2007 at 9:48 PM | Permalink
| Comments (0)
ALUI is FANTASTIC ! I've started to think about ways that these technologies will really help my ISV software customers.
ALUI can front-end just about any of your existing HTML applications. Modernization of UI look-and-feel is two steps ahead when put through an ALUI portal. The UI can immediately be put through governance, usage tracking, and updated style sheets. You may not have had visibility into which portions of your application are used. With ALUI, that mystery can be reduced.
Reusable AJAX components can also be made in the form of portlets. Portlet-to-portlet communication immediately supports asynchronous portlet updates out-of-the-box. You do not need a complex AJAX platform, or, if you have existing AJAX components, those can be used as well.
ALUI also offers document indexing, integration with Business Process Management (ALBPM), analytics, and an extensible framework for adding new capabilities such as AquaLogic Pages, AquaLogic Pathways, and AquaLogic Ensemble. Capabilities that you or your customers don't need do not have to be installed, but they are available when needed.
For Independent Software Vendors (ISVs), ALUI provides a mechanism to present your application, regardless of complexity, securely and efficiently. Deployment time and customizations are simplified through the IDK over "hard" integrations through JSPs, Servlets, application integration, and so on.
User security integration through LDAP, Active Directory, or the portal itself is supported. Every integration point, from users to browsers to applications and across multiple languages, is simplified. Deployments are normalized and more predictable.
Soft integration is enhanced. This is the side of integration that makes software comfortable and convenient to use (luxurious) rather than merely utilitarian. Widgets and do-dads that are commonly available in the internet world, charts, mapping integration, calendar controls, AreaCode lookup tools, and what-not, can be sprinkled in. These are the touches that will leave customers saying, "you really took the time to understand our needs", (and it will be a relatively small project).
There is a strong benefit for ISVs that embrace portal technology as a deployment vehicle first. Because portal frameworks easily and conveniently include plug-ins from other applications, the first ISVs to provide this will be the "anchor application" that other apps will be plugged into. In the composite application arena, the application provider who introduces the portal integration mechanism will probably have top-branding. All other application vendors will be components plugged in.
Semantics is Back
Posted by fmikkels on July 26, 2007 at 9:11 PM | Permalink
| Comments (0)
I recently read Bill Roth's JDJ story on Semantics. It is not a new idea, but it is big again. I feel that semantics has been under-utilized in computer programming. Primarily, it is under-utilized because semantics has little meaning without the context of intent, which is often hard to capture, and it's often non-algorithmic, meaning that a fully automated semantically-driven operation will probably fail at some point--the unstructured brings the unexpected.
Data transformation is a task that has slowed down systems integration for as long as there has been systems integration. If you're transforming from a record that has a field labeled first_name to a record with a field named firstName, the transformation system could automatically map that for you. As of a few years ago, the common tools I used would not do that, and wouldn't even map "firstName" to "firstName". By whittling away obvious choices, finding near matches, and possible matches, it seemed to me that a fair mapping of master-detail to master-detail could be done relatively quickly--especially if sample data were available in addition to the record schemas and descriptors. The basic premise of a transformation system is that the contents of the records are close and can be mapped. For a development tool or a plug-in connector for a web component in a rapid development environment, that seems quite valuable.
From a semantic standpoint, our culture already recognizes (123)234-3343 as a phone number and 987-65-4321 as an SSN. 123.234.3343 may be a phone number, though 123.234.33.43 is probably an IP address. Real estate addresses are quite uniform and routinely parsed by mapping programs. The rules for the routine analysis of business and network data are quite good.
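The whittling idea can be sketched in a few lines. This is an illustration under my own assumptions, not a description of any existing mapping tool: names are compared after dropping case and punctuation, obvious matches are taken first, and what is left is offered to Python's standard difflib.get_close_matches for near matches. The field names and the 0.6 similarity cutoff are arbitrary choices.

```python
import difflib


def normalize(name):
    """Drop case and punctuation so first_name and firstName compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def map_fields(source_fields, target_fields):
    """Return {source: target} using exact matches first, then near matches."""
    remaining = {normalize(t): t for t in target_fields}
    mapping = {}
    leftovers = []

    # Pass 1: whittle away the obvious choices.
    for source in source_fields:
        key = normalize(source)
        if key in remaining:
            mapping[source] = remaining.pop(key)
        else:
            leftovers.append(source)

    # Pass 2: look for near matches among the targets still unclaimed.
    for source in leftovers:
        close = difflib.get_close_matches(
            normalize(source), list(remaining), n=1, cutoff=0.6
        )
        if close:
            mapping[source] = remaining.pop(close[0])
    return mapping


print(map_fields(
    ["first_name", "last_name", "phone_num", "shoe_size"],
    ["firstName", "lastName", "phoneNumber", "email"],
))
```

Fields with no plausible partner, such as shoe_size here, are simply left out of the mapping for a person to resolve, which is where sample data would help.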
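Those four examples can be told apart with a handful of regular expressions. The sketch below is only that: the patterns cover the US formats shown above and nothing else, and the labels are my own.

```python
import re

# Patterns for the formats in the paragraph above (US conventions assumed).
PHONE = re.compile(r"\(\d{3}\)\s?\d{3}-\d{4}|\d{3}([.-])\d{3}\1\d{4}")
SSN = re.compile(r"\d{3}-\d{2}-\d{4}")
IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")


def classify(value):
    """Guess what kind of identifier a short string is."""
    value = value.strip()
    if SSN.fullmatch(value):
        return "ssn"
    if PHONE.fullmatch(value):
        return "phone"
    # Four dotted numbers only count as an address if each fits in a byte.
    if IPV4.fullmatch(value) and all(int(part) <= 255 for part in value.split(".")):
        return "ip"
    return "unknown"


for sample in ["(123)234-3343", "987-65-4321", "123.234.3343", "123.234.33.43"]:
    print(sample, "->", classify(sample))
```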
My Semantic Experience
I had an opportunity to employ a semantic analysis as a fifth-level support engineer for a wireless telecom provider. My job was to figure out how to solve tickets. The "long pole" in solving problems turned out to be just figuring out what was broken. Trouble tickets would be issued, or emails, or spreadsheets, or XML files with hexadecimal encoded values, and these may have been supplemented with additional notes on things tried, but usually getting to brass tacks, the phone, the billing account, and the provisioning system were all that was needed. It was either right or wrong in each of those systems, and no amount of trouble ticket commentary would change that.
One failed order may have 100 phones on it. A report from an enterprise customer may have a spreadsheet with 200 phone numbers on it. With a simple Java application I wrote, I would copy and paste the text of whatever had been presented to me into one window. Without pressing a button, the code would gather all the phone numbers, trouble ticket numbers, process IDs, GSM IDs, and other information that could be lexically interpreted. The output was a query to the ticketing system to retrieve those problems that were still outstanding. If the phone number, or trouble ticket ID, or GSM ID, or a couple other values led to a ticket that was assigned to me, I'd fix it.
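The gathering step might look something like the sketch below. It is a reconstruction of the idea in Python rather than the original Java tool, and it handles phone numbers only, since the ticket, process, and GSM ID formats are not given here. The function name is hypothetical.

```python
import re

# Ten-digit numbers written as (123)234-3343, 123.234.3343, or 123-234-3343.
# The lookarounds keep the pattern from matching inside longer digit runs.
PHONE_IN_TEXT = re.compile(
    r"(?<!\d)\(?(\d{3})\)?[\s.-]?(\d{3})[.-](\d{4})(?!\d)"
)


def harvest_phone_numbers(text):
    """Return each distinct phone number in `text` as ten digits, in order."""
    found = ("".join(match.groups()) for match in PHONE_IN_TEXT.finditer(text))
    return list(dict.fromkeys(found))  # de-duplicate, keeping first-seen order


pasted = """Order 77 failed for (123)234-3343 and 555-867-5309.
Customer also reports 123.234.3343 (again); gateway was 123.234.33.43."""
print(harvest_phone_numbers(pasted))
```

The same number written two different ways collapses to one entry, and the IP address is left alone, so the resulting list is ready to feed into whatever lookup the ticketing system offers.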
It turned out that you could identify large numbers of problems that could be fixed just by retrying the operation that failed. I extended the tool to figure those out, and produce the retry script. The off-shore team eagerly grabbed this tool because they could solve 100 tickets in the time it would take to solve 4. Their support numbers greatly improved overnight.
Web 2.0 and Semantics
Very exciting in the Web 2.0 world is the ability to inject semantics. We've all seen smart tags, and sites like Digg and del.icio.us add tagging. A tag keyword is, essentially, a semantic of what the page means to someone.
Tools like the one above could be built onto an existing system. And, the architecture committee could regulate the APIs accessed through governance. The back-end system integration can be pure and written with the goal of flawless operation. Engineering itself cannot take the time to identify all possible errors and which ones are trivially solved versus those that are difficult to solve. They'd never get anything done.
The collaborative effort of solving and fixing problems can do that better on the Web 2.0 side. The problems to solve do change with each update of the system. The legacy of working around every trivial production problem that was ever encountered should not be incorporated into the master system design.