I am trying to create a regex pattern to recognise and highlight source code comments
that look like:
\*this is a comment;
The language is free-format, so comments can occur anywhere on a line, several times
on a line, or span multiple lines. Other statements almost always end with a semi-colon,
apart from a few exceptions that I can work on later. For now I am trying to get a
regex zero-width positive lookbehind assertion to work so that several \*; comments in
a row can be identified. The mode rule I am working on looks like:

    <SPAN_REGEXP TYPE="COMMENT2" AT_WHITESPACE_END="FALSE" MATCH_TYPE="RULE">
    <BEGIN><![CDATA[(?<=[;])[\s]*[*][^;]*]]></BEGIN>
    <END>;</END>
    </SPAN_REGEXP>

I have also tried

    <BEGIN>(?<=[;])[\s]*[*][^;]*</BEGIN>

which draws a complaint, and rightly so, and

    <BEGIN>(?&lt;=[;])[\s]*[*][^;]*</BEGIN>

which fails to highlight, just like the first example.
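
A minimal sketch of why a BEGIN element with a raw `<` is rejected while the CDATA and `&lt;` forms are accepted and carry the same pattern. It assumes Python's standard `xml.etree.ElementTree` as a stand-in for whatever XML parser loads the mode file. The names `variants` and `parsed_text` are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# The lookbehind pattern from the rule above, as the regex engine should see it.
PATTERN = r"(?<=[;])[\s]*[*][^;]*"

# Three ways of writing the same BEGIN element (hypothetical test inputs).
variants = {
    "raw": "<BEGIN>" + PATTERN + "</BEGIN>",
    "cdata": "<BEGIN><![CDATA[" + PATTERN + "]]></BEGIN>",
    "entity": "<BEGIN>" + PATTERN.replace("<", "&lt;") + "</BEGIN>",
}


def parsed_text(xml_source):
    """Return the element text, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(xml_source).text
    except ET.ParseError:
        return None


results = {name: parsed_text(source) for name, source in variants.items()}

for name, text in results.items():
    print(name, "->", text)
```

If this carries over to the mode-file parser, the CDATA and `&lt;` forms deliver an identical pattern to the regex engine, so the missing highlighting would not be an XML escaping problem.
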
An example of the code I am trying to highlight (SAS for the curious):

    /* see how an assignment statement with * works */
    data _null_;
    new_var = old_var * 100;
    /* this comment should work fine, how about the other type */
    no_comment; *comment; *comment;
    *comment; *comment; *comment; *comment;
    *This is also a valid comment; /*as is this*/ *and this; *and this;
    run;

The regex pattern doesn't seem to work at all when used in a mode file. In the search
dialog it works only partially: the match captures the semi-colon from the previous
statement, so a comment cannot be captured when the comment before it has been
captured. (?<=\[;\]) means a semi-colon must be present immediately before the captured
area but must not be captured itself. This way assignment statements, SQL and other
code containing \* are not captured. The first \*; comment after the /\* \*/ comment
fails to capture, as expected; fixing that should only require a small regex change, so
ignore it. To see how the pattern should capture the example, append a semi-colon to the
regex and paste both the regex and the code into:
http://www.myregextester.com/
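
The same check can be sketched offline, assuming Python's `re` module as the regex engine (it supports fixed-width lookbehind, but it is not the engine jEdit uses, so results there may differ). The pattern is the one above with a semi-colon appended. `SAMPLE`, `COMMENT` and `matches` are names chosen for the illustration.

```python
import re

# The example code from the report.
SAMPLE = """\
/* see how an assignment statement with * works */
data _null_;
new_var = old_var * 100;
/* this comment should work fine, how about the other type */
no_comment; *comment; *comment;
*comment; *comment; *comment; *comment;
*This is also a valid comment; /*as is this*/ *and this; *and this;
run;
"""

# Lookbehind for ';', optional whitespace, '*', then everything up to and
# including the terminating ';'.
COMMENT = re.compile(r"(?<=[;])[\s]*[*][^;]*;")

# Strip the leading whitespace that [\s]* pulls into each match.
matches = [m.group().strip() for m in COMMENT.finditer(SAMPLE)]

for text in matches:
    print(text)
```

With this engine the assignment statement is left alone, consecutive \*; comments are each matched, and the first \*; comment after the /\* \*/ comment is skipped, as described above.
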
Submitted | *anonymous - 2010-01-11 01:22:46 | Assigned | |
---|---|---|---|
Priority | 5 | Labels | text area and syntax packages |
Status | open | Group | None |
Resolution | None |

2010-01-18 16:15:19 goebbe
Please take a look at the file sas.xml in your jEdit home directory.

2010-01-19 09:36:48 goebbe
See also

2013-02-12 11:21:53 muntjac
The lack of support for zero-width positive lookbehind assertions also makes it impossible
to correctly highlight strings in Matlab/Octave source code, as strings can begin
and end with a single quote ('), but in certain contexts that character can also signify
matrix/vector transposition.
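
A rough sketch of how a lookbehind can separate the two uses of the quote, written in Python's `re` rather than as a jEdit mode rule. It assumes a simplified heuristic: a quote directly after an identifier character, a closing bracket, a dot or another quote is a transpose, and any other quote opens a string. The names `STRING` and `string_literals` are made up for the example, and real Matlab/Octave syntax has more cases than this covers.

```python
import re

# A quote opens a string only if it is NOT directly preceded by an identifier
# character, a closing bracket, a dot or another quote (assumed heuristic).
# Inside the string, a doubled quote '' stands for a literal quote.
STRING = re.compile(r"(?<![\w)\]}.'])'(?:[^'\n]|'')*'")


def string_literals(line):
    """Return the substrings of `line` that the heuristic treats as strings."""
    return [m.group() for m in STRING.finditer(line)]


for line in ["y = x';", "s = 'abc';", "z = [1 2 3]' + x';", "t = 'it''s';"]:
    print(line, "->", string_literals(line))
```

Without the lookbehind, the third line would be misread as containing the string `' + x'`.
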