Proparse discussion


Proparse API performance so far

I just ran the latest version of the new Proparse API from Subversion. I was a bit puzzled by the performance; I wonder whether your timings are anywhere near mine.

This piece in ClassClient.p :

  DO lv_i = 0 TO lv_numnodes:
    ParseUnit:getnode(lv_i).
  END.

takes 21 seconds. That's quite a bit slower than expected :-(

When I empty all the code lines from the constructor in Nodes.cls, it still takes 21 seconds.
When I remove the statement in ParseUnit.cls that reads:

  NewNode = NEW proparseclient.Node(THIS-OBJECT,BUFFER TTNode:HANDLE).

What is the minimum version of Progress?

I've just caught myself writing some code and realized that it was 10.1C-specific.

METHOD PUBLIC LOGICAL LoadFiles (p_Header AS CHAR, p_Data AS CHAR):

  blobutilities:LoadBlobFromFile(p_Header, HeaderBlob).
  blobutilities:LoadBlobFromFile(p_Data, DataBlob).

  RETURN YES.

  CATCH e AS Progress.Lang.Error:
    SET-SIZE(HeaderBlob) = 0.
    SET-SIZE(DataBlob) = 0.
    DELETE OBJECT e.
    RETURN NO.
  END CATCH.

END METHOD.

What is the minimum Progress version required for the new class-based Proparse API?


Refactoring preprocessor directives

A parser accepts a stream of tokens and builds the tree from it. Proparse's token stream has some token types that get filtered out of the stream and attached to nodes as hidden tokens.

com.joanju.proparse.DoParse.java:

filter.hide(NodeTypes.WS);
filter.hide(NodeTypes.COMMENT);
filter.hide(NodeTypes.AMPMESSAGE);
filter.hide(NodeTypes.AMPANALYZESUSPEND);
filter.hide(NodeTypes.AMPANALYZERESUME);
filter.hide(NodeTypes.AMPGLOBALDEFINE);
filter.hide(NodeTypes.AMPSCOPEDDEFINE);
filter.hide(NodeTypes.AMPUNDEFINE);
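
The effect of hide() can be pictured with a small, self-contained sketch. This is not Proparse's actual implementation (which uses ANTLR's hidden-token filtering); the Token class and filter() method here are invented for illustration. Hidden token types are removed from the main stream and attached to the next visible token:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class HiddenFilter {
    static class Token {
        final String type, text;
        final List<Token> hiddenBefore = new ArrayList<>();
        Token(String type, String text) { this.type = type; this.text = text; }
    }

    // Hidden token types are dropped from the visible stream; each run of
    // hidden tokens is attached to the next visible token that follows it.
    static List<Token> filter(List<Token> input, Set<String> hidden) {
        List<Token> visible = new ArrayList<>();
        List<Token> pending = new ArrayList<>();
        for (Token t : input) {
            if (hidden.contains(t.type)) {
                pending.add(t);
            } else {
                t.hiddenBefore.addAll(pending);
                pending.clear();
                visible.add(t);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
            new Token("COMMENT", "/* note */"),
            new Token("ID", "customer"),
            new Token("WS", " "),
            new Token("ID", "name"));
        List<Token> visible = filter(tokens, Set.of("WS", "COMMENT"));
        System.out.println(visible.size());                      // 2
        System.out.println(visible.get(0).hiddenBefore.size());  // 1
    }
}
```

The parser then only ever sees the visible tokens, while tools that care about comments or preprocessor directives can still reach them through the node they are attached to.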

Lines and columns

(Answering offline questions)
Yes, the line and column fields in the node are indeed the line and column of the token in the original source file. Both count from one.

There is also a source file reference number, which indexes into the array of source file names (counting from zero). (See client.p, which writes out the array of source file names.)
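
As a hypothetical illustration of these conventions (the names here are made up, not the real API): line and column count from one, while the file reference indexes a zero-based array of file names.

```java
public class NodePosition {
    // line and column count from one; fileIndex counts from zero
    // into the array of source file names.
    static String describe(int line, int column, int fileIndex, String[] files) {
        return files[fileIndex] + ":" + line + ":" + column;
    }

    public static void main(String[] args) {
        String[] files = { "main.p", "include.i" };  // indexes 0 and 1
        System.out.println(describe(1, 1, 0, files));  // prints main.p:1:1
    }
}
```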


Why do we need an index for fixed-length records?

If a record is fixed length, then surely the record offset is easily calculated?

If a node record is 32 bytes long and the node data set starts at byte 1000, then record n is at position 1000 + 32 * (n - 1). Why do we need an index of node records pointing to each node record?
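
The arithmetic in question, as a sketch. DATA_START and RECORD_SIZE are the assumed values from the example above, not values taken from the actual blob layout:

```java
public class NodeOffset {
    static final long DATA_START = 1000;  // assumed start of the node data set
    static final int RECORD_SIZE = 32;    // assumed fixed record length

    // Record n (counting from one, as in the example) starts at:
    static long offsetOf(int n) {
        return DATA_START + (long) RECORD_SIZE * (n - 1);
    }

    public static void main(String[] args) {
        System.out.println(offsetOf(1));  // 1000
        System.out.println(offsetOf(2));  // 1032
    }
}
```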


Strings

Whilst I understand and appreciate the rationale behind fixed-length records, and the need to store a string's offset in a field position, I feel it may be overcomplicated.

Why couldn't proparse.jar return three blobs: Header, Records and Strings?

The strings blob would simply be a collection of null-terminated strings (no sizes, etc.):

FOO\0
BAR\0

Each record field that is a string could simply contain a pointer to the string's starting position. The Progress GET-STRING() function reads until it hits a NUL, so we don't need to record the size of the string.
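
A minimal sketch of this proposal in Java (the helper names append() and read() are invented for illustration): strings are written into one blob as null-terminated byte sequences, and a string is read back given only its starting offset, by scanning to the NUL, just as GET-STRING() would:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class StringsBlob {
    // Append s plus a NUL terminator; return the offset where s starts.
    static int append(ByteArrayOutputStream blob, String s) {
        int offset = blob.size();
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        blob.write(bytes, 0, bytes.length);
        blob.write(0);  // NUL terminator
        return offset;
    }

    // Read the string starting at offset, up to (not including) the NUL.
    static String read(byte[] blob, int offset) {
        int end = offset;
        while (blob[end] != 0) end++;
        return new String(blob, offset, end - offset, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream blob = new ByteArrayOutputStream();
        int foo = append(blob, "FOO");  // offset 0
        int bar = append(blob, "BAR");  // offset 4 ("FOO" + NUL = 4 bytes)
        byte[] bytes = blob.toByteArray();
        System.out.println(read(bytes, foo));  // FOO
        System.out.println(read(bytes, bar));  // BAR
    }
}
```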


Schema and speed

One thing that I think is costing us speed is the schema of each record. For each node, there is a pointer to the node record. Once we have that record, we then need the schema pointers to get to the value of each field. This is done so that we keep the ability to change the schema around if required.

This means that:

a) we need to include the schema in each blob;
b) we need to read the schema to compute the record and field offsets;
c) each field's position has to be calculated and stored in a variable.
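
The cost being described can be sketched with a hypothetical schema mapping field names to byte sizes. The field offsets are derived from the schema once and then reused for every record access; a hard-coded layout would skip this derivation entirely:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RecordSchema {
    // Compute each field's byte offset within a record from a schema of
    // field sizes. Done once per blob, then cached, instead of being
    // re-derived on every field access.
    static Map<String, Integer> computeOffsets(Map<String, Integer> fieldSizes) {
        Map<String, Integer> offsets = new LinkedHashMap<>();
        int pos = 0;
        for (Map.Entry<String, Integer> e : fieldSizes.entrySet()) {
            offsets.put(e.getKey(), pos);
            pos += e.getValue();
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Hypothetical node record layout, purely for illustration.
        Map<String, Integer> schema = new LinkedHashMap<>();
        schema.put("nodeType", 4);    // assumed 4-byte integer fields
        schema.put("line", 4);
        schema.put("column", 4);
        schema.put("textOffset", 4);  // pointer into the strings blob
        Map<String, Integer> offsets = computeOffsets(schema);
        System.out.println(offsets.get("column"));  // 8
    }
}
```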


Changing the tree for refactoring?

I'm answering an offline question here, because I think it's of reasonably general interest for people using the parser.


What is the definition of an invalid node?

If a child node reference is unknown or not valid, what is its value? Is it null, -1, 0, etc.? I seem to get some strange numbers (434 when NumNodes is 431, for example).

Would it be possible to set all "invalid" node references to -1?

This applies to NextSibling, PrevSibling, Parent, FirstChild, etc.
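
The suggested convention could be sketched as follows; normalize() is a hypothetical helper, not part of the Proparse API, that maps any out-of-range reference to -1:

```java
public class NodeRefs {
    // Treat any reference outside [0, numNodes) as "no node" and
    // represent it uniformly as -1.
    static int normalize(int ref, int numNodes) {
        return (ref >= 0 && ref < numNodes) ? ref : -1;
    }

    public static void main(String[] args) {
        System.out.println(normalize(434, 431));  // -1 (out of range, as in the post)
        System.out.println(normalize(42, 431));   // 42 (valid reference, unchanged)
    }
}
```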


Proparse server config, 'prorefactor' directory

Jurjen, I'm working on the proparse.jar server configuration.
I'm going to set it up so that the client can ask the server to load a project's settings. That way a client (like Prolint) can dump the project configuration automatically at start-up and have the server read it.
The server process will always require its './prorefactor' directory for reading and writing various configuration and other files.

PUB binary files directory

