Genii Weblog

Limits to cleverness

Mon 20 Dec 2004, 09:01 AM



by Ben Langhinrichs
Warning: This post has been rated Über-Geek.  Parents and Employers should be appropriately cautious.

In Top That, Rocky!, I talked about using nested @Transform statements to make a pretty comprehensive column formula for sorting full names into the appropriate lastname, firstname format, so that we could sort by last name. ( I should emphasize that this would not handle Maria Von Trapp, with the Von capitalized, without adding a bunch of dangerous assumptions, such as a list of potential middle parts.)

The Question Raised (or "New Challenge")
A friend wrote and asked a follow up question which I thought I could be terribly clever with.  He wanted to know how I could sort a list of names by the lastname, the way I had done, but show it as a list of full names.  Thus:

Joe Litton
Cindy Lou Who
Mary Jane van der Welten
Tom Duff
Ben Langhinrichs
Harry Belafonte

should be listed as

Harry Belafonte
Tom Duff
Ben Langhinrichs
Joe Litton
Mary Jane van der Welten
Cindy Lou Who

The Clever Solution
My first reaction was that this was a great excuse for using the new @Sort with the [CustomSort] logic, and it certainly does work.  The code I came up with is:

sorted := @Sort(Speaker; [CustomSort];
@If(
   @Do(R:=@Transform(@Explode($A; " "); "W";
      @If(@Matches(W; "{a-z}*"); W+@RightBack($A; W)+", "+@LeftBack($A; W); @Nothing));
         @If(@Trim(R) != ""@Trim(R)[1]; @RightBack($A; " ")+", "+@LeftBack($A; " "))) >
   @Do(R:=@Transform(@Explode($B; " "); "W";
      @If(@Matches(W; "{a-z}*"); W+@RightBack($B; W)+", "+@LeftBack($B; W); @Nothing));
         @If(@Trim(R) != ""@Trim(R)[1]; @RightBack($B; " ")+", "+@LeftBack($B; " ")));
   @True@False));

which is hard to read, but is really just the @Transform I used in the previous post to split up the last name and first name, and then a comparison (see that sneaky little > between the @Do's), and return @True or @False back to the @Sort.  Very cool that it works, but a bit too clever, I'm afraid.  I started thinking about the fact that in a complex sort, the elements might be retrieved repeatedly and compared over and over again, and I realized that this is probably going to bog way down with a longer list.  Each @Transform has to be done many times for the same name, and that is just expensive.

The Smarter Solution
So, I scaled back and wrote got rid of the [CustomSort].  This is simpler, shorter, and should be much faster:

sorted_by_last := @Sort(@Transform(Speaker; "S";
@If(@Do(R:=@Transform(@Explode(S; " "); "W";
          @If(@Matches(W; "{a-z}*"); W+@RightBack(S; W)+", "+@LeftBack(S; W); @Nothing)); 
                  @If (@Trim(R) != ""@Trim(R)[1]; @RightBack(S; " ")+", "+@LeftBack(S; " ")))));
sorted := @Trim(@RightBack(sorted_by_last; ",")+" "+@LeftBack(sorted_by_last; ","));

It falls a little lower on the coolness continuum, because it doesn't manage to use @Transform as part of a [CustomSort], but fast is cool too.  It works by creating the sorted list with the lastname, firstname format, then undoing it in the last line.

Now, for those few desparate souls who are still reading, I'm sorry that you lack anything more constructive to do with your time.  But, since you don't, I have a question.  Would the formula above be faster or the one below?  Can you understand what each does and what the differences are?  Would the latter be safer in any way?

The Other Possible Smart Solution
sorted_by_last := @Sort(@Transform(Speaker; "S";
@If(@Do(R:=@Transform(@Explode(S; " "); "W";
          @If(@Matches(W; "{a-z}*"); W+@RightBack(S; W)+", "+@LeftBack(S; W); @Nothing)); 
                  @If (@Trim(R) != ""@Trim(R)[1]; @RightBack(S; " ")+", "+@LeftBack(S; " ")))));
sorted := @Left((Speaker+"$")+sorted_by_last; "$")

And for those still not asleep or trying to get at that awkward itch you always develop right after they put the straitjacket on, do you have a better way still?

Copyright © 2004 Genii Software Ltd.

What has been said:


254.1. Damien Katz
(12/20/2004 10:21 AM)

This is so cool! Finally, years after I created all that stuff I'm seeing examples of what it can accomplish. I've never actually had occasion to use anything of the new stuff, so this is the closest I've come to seeing it in action.

Which ones faster? I don't know, I'd guess the first probably has a slight performance advantage, but it's so hard to read that it just isn't worth it.

Also, for readability I would use more temporary variables instead of nesting the expressions so deeply. There is practically no overhead using temp vars and it might make the formulas easier to understand.

Thanks for this post Ben, really cool stuff (to me anyway)!


254.2. Julian Robichaux
(12/21/2004 03:00 PM)

Well, I won't call this a "better" solution, but I've been playing around a bit, and here's what I came up with (the formatting will probably end up getting shot, so I apologize in advance):

---------------------------------

Speaker := "Jean Claude van Damme" : "Catherine Zeta Smith" : "Tim Burton" : "Bruce Willis" : "Prince" : "Oscar de la Hoya";

REM {Add a tilde as a placeholder before words that begin with lowercase letters};

letters := "a b c d e f g h i j k l m n o p q r s t u v w x y z";

letterList := " " + @Explode(letters);

letterReplaceList := " ~" + @Explode(letters);

sParsed := @ReplaceSubstring(Speaker; letterList; letterReplaceList);

REM {If a name has any tildes, split at the first one; otherwise split at the last word};

slist := @Transform(sParsed; "S";

@If(@Contains(S; "~"); @Right(S; "~") + ", " + @Left(S; "~");

@Contains(S; " "); @RightBack(S; " ") + ", " + @LeftBack(S; " ");

S));

REM {Remove any extra tildes (for example, "John ~van ~der Schlumdt")};

slist := @Trim(@ReplaceSubstring(slist; "~"; ""));

REM {Sort the list};

sorted := @Sort(slist);

@Prompt([Ok]; "Sort By Last"; @Implode(sorted; @Char(10)));

REM {Change it back to firstname lastname format};

sorted_by_last := @Trim(@Word(sorted; ","; 2) + " " + @Word(sorted; ","; 1));

@Prompt([Ok]; "Sort By Last"; @Implode(sorted_by_last; @Char(10)));

---------------------------------

I guess if you wanted to cram it all together, you could also call it like:

---------------------------------

Speaker := "Jean Claude van Damme" : "Catherine Zeta Smith" : "Tim Burton" : "Bruce Willis" : "Prince" : "Oscar de la Hoya";

letters := "a b c d e f g h i j k l m n o p q r s t u v w x y z";

sorted := @Sort(@Trim(@ReplaceSubstring(@Transform(@ReplaceSubstring(Speaker; " " + @Explode(letters); " ~" + @Explode(letters)); "S"; @If(@Contains(S; "~"); @Right(S; "~") + ", " + @Left(S; "~"); @Contains(S; " "); @RightBack(S; " ") + ", " + @LeftBack(S; " "); S)); "~"; "")));

sorted_by_last := @Trim(@Word(sorted; ","; 2) + " " + @Word(sorted; ","; 1));

@Prompt([Ok]; "Sort By Last"; @Implode(sorted_by_last; @Char(10)));

---------------------------------

Same cat, different way to skin it. I suspect your method might be a little more efficient.

By the way, what are you using to color-code your Formula language stuff in HTML?

- Julian


254.3. Feri
(05/12/2006 05:24 AM)

simpler one:

tmp := "Joe Litton":"Cindy Lou Who":"Mary Jane van der Welten":"Tom Duff":"Ben Langhinrichs":"Harry Belafonte"; @Right( @Sort( (@RightBack( tmp; " " ) + " " + @LeftBack( tmp; " " )) + "|" + tmp ); "|" )