
(181/211) 3974 - syntax highlighter doesn't like division. confusion with RegExp //?

~~~~
mediaSizeB=(atold_(t.value,true,true,',',["b"],false)/8)+atold_(t.value,true,true,',',["B"],false);//column 1 is media size in bytes
~~~~
vista x32 java 8.4 je 5.3
syntax highlighter (.js) shows this with incorrect highlighting from
~~~~
/8)+atold_(t.value,true,true,',',["B"],false);//column 1 is media size in bytes
~~~~
onward.

Submitted jmichae3 - 2016-04-07 17:15:03.181000 Assigned kpouer
Priority 5 Labels
Status open Group minor bug
Resolution None

Comments

2016-04-07 17:18:42.426000
jmichae3

It turns blue until it hits the first / in the comment. This is a common problem with JS syntax highlighters: a parsing problem. See the ECMAScript (fewer functions, no DOM) or JavaScript EBNF.

2016-04-07 18:43:01.866000
daleanson

This looks like an easy fix, it's caused by this line in the javascript mode file:

<SEQ_REGEXP TYPE="MARKUP" HASH_CHAR="/" AT_WORD_START="TRUE">/[^\p{Blank}]*?/</SEQ_REGEXP>

Simply removing this line from the mode file appears to fix this particular problem. However, it's not clear to me exactly what this line is supposed to do.
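To illustrate why the rule quoted above swallows the division sign, here is a minimal sketch that applies an approximation of its pattern to the line from the report. It assumes that `\p{Blank}` (a Java character class, not available in JavaScript) can be stood in for by `[ \t]`, and it ignores the AT_WORD_START check; the names `regexLiteralRule`, `highlighted` and `remainder` are made up for this example.

```javascript
// Approximation of the mode-file pattern /[^\p{Blank}]*?/ :
// a slash, the shortest possible run of non-blank characters, then a slash.
// ASSUMPTION: [ \t] stands in for Java's \p{Blank}.
const regexLiteralRule = /\/[^ \t]*?\//;

// The line from the original report.
const line = "mediaSizeB=(atold_(t.value,true,true,',',[\"b\"],false)/8)" +
  "+atold_(t.value,true,true,',',[\"B\"],false);//column 1 is media size in bytes";

const match = regexLiteralRule.exec(line);
const highlighted = match[0];
const remainder = line.slice(match.index + highlighted.length);

// The span treated as a regular expression literal:
console.log(highlighted); // /8)+atold_(t.value,true,true,',',["B"],false);/
// What is left no longer starts with "//", so it is not seen as a comment:
console.log(remainder);   // /column 1 is media size in bytes
```

Because nothing between the division sign and the comment is a space or tab, the match runs from the `/` before `8` up to the first `/` of the comment, which is the span the reporter describes.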

2016-04-07 18:59:36.388000
daleanson

- **assigned_to**: Matthieu Casanova

2016-04-07 18:59:36.614000
daleanson

Matthieu, from the svn log, it looks like you added this line back in 2008 in revision 12059 and adjusted it in revision 12089 as part of work on this ticket:

https://sourceforge.net/p/jedit/bugs/2253/

I'm not up on javascript regular expression usage, but I'm wondering if there is a better way to highlight the regex without consuming a division sign followed by an end of line comment on the same line?

2016-04-07 22:52:07.830000
jmichae3

there are 4 situations where / is used (block comment, line comment, regexp literal, division):
~~~~
/* comment... */
//comment
Regexp(/regexp here, can contain \//)
string.split(/regexp here, can contain \//);
var n=z/364.25/24/60/60/1000;//comment or /*comment
~~~~
and on the var line, is that an even number of /'s or odd?

2016-04-08 22:26:19.389000
marchaefner

Distinguishing JavaScript's regular expression literals from the division operator is notoriously difficult, as the interpretation of a single `/` is context-sensitive.

The concrete problem is that AT_WORD_START matches, which is almost always correct (e.g. after `,` or `=`), but not in this case, since after a closing parenthesis only a division operator makes any syntactical sense. (Or to put it more parsery: regular expression literals, parenthesized expressions and super/call expressions are all parsed as LeftHandSideExpression, and there is no rule with two of those next to each other, as per ECMA-262 6.0.)

As a lazy fix one could replace the `<SEQ TYPE="OPERATOR">)</SEQ>` (a few lines above the SEQ_REGEXP for regular expression literals) with this ugly thing:
```xml
<SEQ_REGEXP TYPE="OPERATOR" HASH_CHAR=")">)(\s*/(?![/*]))?</SEQ_REGEXP>
```
This marks the closing parenthesis and a possibly following single slash (but not a comment). Although this treats both characters and any whitespace between them as one token, so far I have not been able to produce any unwanted side effects while testing.
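The behaviour of that pattern can be checked outside jEdit. The following is a minimal sketch that runs the same expression through JavaScript's regex engine, assuming it behaves like Java's for this pattern; `closeParenRule` and `firstMatch` are hypothetical names used only here.

```javascript
// The pattern from the proposed SEQ_REGEXP, written as a JavaScript literal:
// a ")" optionally followed by whitespace and ONE slash that does not
// start a "//" or "/*" comment.
const closeParenRule = /\)(\s*\/(?![\/*]))?/;

// Returns the text of the first match, or null if there is none.
function firstMatch(text) {
  const m = closeParenRule.exec(text);
  return m ? m[0] : null;
}

console.log(JSON.stringify(firstMatch("false)/8)+x")));  // ")/"
console.log(JSON.stringify(firstMatch("(a+b) / 2")));     // ") /"
console.log(JSON.stringify(firstMatch("f(x);//comment"))); // ")"
console.log(JSON.stringify(firstMatch("f(x) /* c */")));  // ")"
```

In the first two cases the division sign is consumed together with the parenthesis, so a later regex-literal rule never sees it. In the last two the lookahead rejects the slash and only the parenthesis is matched, leaving the comment to the comment rules.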

Another route would be to emulate part of the actual parser logic with delegation, to distinguish between states where a division operator is expected and states where a regex literal is expected. This would be rather complex, but it would also enable more permissive matching of regex literals, specifically allowing whitespace (which is legal anywhere in regexes).
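The state-tracking idea can be sketched in plain JavaScript, independent of jEdit's mode-file mechanism. This is a rough illustration under simplifying assumptions, not a full lexer: a slash counts as division when the previous significant character ends an operand (a word character, `$`, `)`, `]`, or a closing quote), and as the start of a regex literal otherwise. Keywords such as `return`, template literals and `}` are deliberately not handled, and `classifySlashes` is a hypothetical name.

```javascript
// Classifies every "/" that starts a token in `src`.
function classifySlashes(src) {
  const result = [];
  let afterOperand = false; // true when the previous token ends an expression
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (/\s/.test(ch)) { i++; continue; }          // whitespace keeps the state
    if (ch === "/" && src[i + 1] === "/") {         // line comment
      result.push({ index: i, kind: "line-comment" });
      const end = src.indexOf("\n", i);
      i = end === -1 ? src.length : end;
      continue;
    }
    if (ch === "/" && src[i + 1] === "*") {         // block comment
      result.push({ index: i, kind: "block-comment" });
      const end = src.indexOf("*/", i + 2);
      i = end === -1 ? src.length : end + 2;
      continue;
    }
    if (ch === '"' || ch === "'") {                 // string literal: skip it
      let j = i + 1;
      while (j < src.length && src[j] !== ch) {
        if (src[j] === "\\") j++;                   // skip escaped character
        j++;
      }
      i = j + 1;
      afterOperand = true;
      continue;
    }
    if (ch === "/") {
      if (afterOperand) {                           // division operator
        result.push({ index: i, kind: "division" });
        afterOperand = false;
        i++;
        continue;
      }
      result.push({ index: i, kind: "regex" });     // regex literal: find its end
      let j = i + 1;
      let inClass = false;                          // inside [...] a "/" is literal
      while (j < src.length && src[j] !== "\n") {
        const c = src[j];
        if (c === "\\") { j += 2; continue; }
        if (c === "[") inClass = true;
        else if (c === "]") inClass = false;
        else if (c === "/" && !inClass) break;
        j++;
      }
      i = j + 1;
      afterOperand = true;                          // a regex literal is an operand
      continue;
    }
    afterOperand = /[\w$)\]]/.test(ch);             // any other character
    i++;
  }
  return result;
}

const kinds = (s) => classifySlashes(s).map((t) => t.kind).join(",");
console.log(kinds("var n=z/364.25/24/60/60/1000;//comment or /*comment"));
// division,division,division,division,division,line-comment
console.log(kinds("string.split(/regexp here, can contain \\//);"));
// regex
```

Note that the regex branch accepts spaces inside the literal, which is the more permissive matching mentioned above; it is only safe because the division case has already been ruled out by the state flag.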