OK thanks.
I thought there might be null matches.
I had already suggested different search strings that work for the customer, but I just wanted to report it in case this was a bug that needed to be fixed.
Hi Admin,
I register the Actipro license at the very beginning of my application, as below.
WinFormsControlsLicenseProvider<ActiproSoftware.Products.SyntaxEditor.AssemblyInfo>.RegisterLicense(Resources.ActiproLicensee, Resources.ActiProLicenseKey);
My application works well on most test machines, but it fails on one of them, where I had previously installed and then uninstalled the Actipro components using the evaluation option. Steps to reproduce:
1. Install the ActiPro demo version using evaluation option.
2. Uninstall it
3. Install my application
4. Start up my application and display the syntax editor; a license dialog is then shown.
In this case the "WinFormsControlsLicenseProvider.RegisterLicense" call has no effect. I think it is a bug.
Thanks
Jiandong
Hi Jiandong,
Thank you for reporting this. We found the problem and have fixed it for the next maintenance release. In the meantime, if you email our support address, we can privately tell you how to remove the evaluation data from that one machine if it is a blocking issue for you.
What's the support address?
Hello, you can find our email addresses on our Contact Us page.
I allow the user to switch Automatic outlining on/off.
Switching it off is no problem. I set the mode to None and it goes away.
Switching it back on however does not cause the outlining to reappear.
I had expected it would re-perform the automatic outlining when I set the mode to Automatic.
I tried adding the line SyntaxEditor.RefreshOutlining but that had no effect either.
Finally, adding Document.Reparse worked, but that seems like overkill to reparse the entire document just to apply the outlining.
Am I missing something here?
Hi Mike,
Thanks for letting us know about this. It appears to be due to a somewhat recent code change elsewhere that triggered this bug. We found the problem and fixed it for the next maintenance release. Now the outlining will refresh as soon as you put Auto mode back on. No other calls will be necessary.
I sent an email to support@ days ago but haven't gotten any response yet. Is that the right address?
Thanks for letting us know. Yes, that is correct. For some reason the Google Apps mail server flagged your message as spam, so our ticket system never got it. We'll look at it and get back to you.
The following is a simplified example of a RegEx replace that one of my customers is trying to perform.
Original text:
1 x
2 x
3 x
4 x
5 x
6 x
7 x
8 x
9 x
i.e. (space)#(space)x
Find Expression: \b+([^ ]+).+
Replace with: $1
The output is:
1
3
5
7
9
while it should be:
1
2
3
4
5
6
7
8
9
I tried the same expression in Visual Studio and it results in the correct output.
We tried various versions of the find expression, adding $ to the end or ^ to the start, etc.
We even tried replacing the . with [\s\w] and specifically trying to catch the line end with [\r\n] to ensure that it did not 'match' the line end.
The only thing that worked was to change the \b+ to \s+, which works OK for the specific situation where the lines start with a space ... but in some cases they don't. (And using \s* at the start produces the same incorrect results as \b+.)
Could you please take a look at the RegEx engine you are using?
Thanks
Mike
Now that my app is rolling out to customers I am starting to get reports of rather odd performance issues that were never seen with the old editor we used.
What makes them odd is that a very small change to the language definition file [these are XML defined languages] will make a huge difference to performance in one case but not in the others.
The 3 issues are:
#1 - Very large files take a long time to load/parse. (eg. 500KB)
#2 - A relatively small file [20K] that contains no carriage returns (ie. all on one line) takes a long time to load/parse, and then typing is extremely slow.
(A 20K file with CRs every 20 to 100 characters - as is more normal - has no issues at all.)
#3 - Typing into a 'block' selection of zero width spanning 50 lines is OK for the first couple of characters but then gets slower and slower with every extra character typed.
I accept that large files may take a long time to load/parse since the language is complex.
(5 child states, 74 Pattern Groups and over 1000 Patterns. Almost all Explicit though.)
But when I replaced just 3 simple RegEx patterns with their equivalent 22 Explicit patterns, issue #1 improved by 300%.
(All patterns were defining the same TokenID.) There was no noticeable improvement in issue #2 or #3, though.
Such a small number of changed patterns should not have such a huge impact on performance, and definitely should not affect only one of the issues.
An even bigger surprise was when I removed 3 tokens (out of over 1000) from one of the Explicit pattern groups: issue #2 improved by 800%!
Again, this change had no noticeable effect on issue #1 or #3.
There seems to be no logic to what causes performance issues.
Note that BOTH performance improvements were made to the same TokenID, and this is the only set of tokens that uses a LookBehind pattern that starts with '^' - to match the start of the line. (This is the only thing I can see that makes these tokens different from all the others.)
The 800% improvement occurred when I removed 3 tokens which each started with a period from a set of 6 tokens - the others did not start with a period.
The set was: GOTO EXIT QUIT .GOTO .EXIT .QUIT
I suspect that the issue with the very long lines is that you reparse the entire current line +/- 1 or 2 lines every time the text changes. So in this situation you reparse the entire text after every keystroke.
I think the old editor I used only parsed 2 or 3 tokens +/- the current position on keystrokes. It then parsed the current line +/- 1 or 2 when the cursor moved to a new line. (Things like AutoCasing were only done after the line number changed.)
The result was that long lines made no difference at all to performance.
I have not found anything that helps issue 3 but it must be related to something in the language definition as I don't see the issue when I use the much simpler SQL language from your sample app. (It does still apply to the 'limited' language I switch to when the text exceeds 200KB. This has far fewer tokens defined than the full language but still more than your demo file.)
Do you have a set of recommended DOs and DONTs for language definitions?
Hi Mike,
This sort of regex seems to work for me:
\b+([^ \n]+).+
Note that I added the \n in the character class.
Yes that works so I have forwarded it to my customer.
However, I don't understand why adding \n to the 'not a space' class would matter in this specific example. Since we are looking for the first space after the first word, this should only be required if the first word is immediately followed by a 'new line' character... which it never is in the example.
Hi Mike,
A "[^ ]+" will consume all non-space characters, including newlines. That's why it will match forward.
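A quick way to see this point is to try the character class on a string that begins with a line break. The sketch below uses Python's `re` module purely as an illustration; it is not the SyntaxEditor engine, and the sample string is made up to resemble the boundary between two lines of the example text.

```python
import re

# Hypothetical sample: the line break that ends one line, followed by the next line.
tail = "\n 2 x"

# "[^ ]" means "anything except a space", so the newline qualifies.
loose = re.match(r"[^ ]+", tail)

# Adding \n to the class stops the match from starting on the line break.
strict = re.match(r"[^ \n]+", tail)

print(repr(loose.group()))  # the matched text is the newline itself
print(strict)               # no match at this position
```

So the negated class only excludes what is listed inside the brackets; a line break has to be excluded explicitly.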
OK. That's what I thought, but since the first word on every line is followed by a space character it should never reach those line end characters in the part of the pattern that is within parens.
Surely the part that should exclude the line end is the final .+ which is why we first tried adding a $ to the end so that it would match the line end. The original pattern seems to correctly stop at the line end, but then it replaces everything on the second line except for the line end character itself. (and every second line.)
Also, someone at Actipro had told me that your RegEx Find/Replace was designed to work the same as the one in Visual Studio, and our original pattern works correctly in VS.
Hi Mike,
1) Yes the WinForms SyntaxEditor lexes the entire document initially and after lexer definition changes, which can take time when working on a large document. Using a programmatic lexical parser can help increase speed there but is more work for you to write. The newer WPF version handles this better in general because it only lexes through what it needs to in order to display text and won't blindly lex the entire document.
You mentioned that 3 regex patterns seemed to be causing major performance issues there. What were the regex patterns? Keep in mind that if you use regex code like "foo .* bar" then it has to constantly do a ton of searching ahead (possibly to the end of the document) to see if there are any "bar" occurrences after a "foo". You can reduce that by doing something like "foo [^\n]* bar". That's one example that might help if that's a scenario you are running into. But without knowing the patterns you are using, it's hard to say. I do suspect that some refactoring of the regex patterns in some form would help though.
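As a rough illustration of the "foo .* bar" versus "foo [^\n]* bar" point, here is a small sketch in Python's `re` module (not the SyntaxEditor lexer). It assumes an engine in which `.` is allowed to cross line breaks, which is emulated here with `re.DOTALL`; the sample text is invented.

```python
import re

# Invented sample: the first "foo" has no "bar" on its own line.
text = "foo a\nb bar\nfoo c bar\n"

# Assumption: "." may cross line breaks (emulated with DOTALL), so the
# scan for "bar" can run far past the line where "foo" was found.
unbounded = re.compile(r"foo .* bar", re.DOTALL)

# The negated class cannot cross "\n", so each attempt stays on one line.
bounded = re.compile(r"foo [^\n]* bar")

unbounded_hits = unbounded.findall(text)
bounded_hits = bounded.findall(text)

print(unbounded_hits)  # one match spanning all three lines
print(bounded_hits)    # only the single-line occurrence
```

The bounded form gives up at the end of the line instead of searching ahead through the rest of the text, which is the behavior the suggestion is aiming for.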
Also keep in mind that the dynamic lexer checks patterns in the order they are defined. So put the most common (and maybe least expensive) ones first in your lexical state.
2) We do lex by document line so extremely long lines can run into performance issues. If that set (.GOTO .EXIT .QUIT) is a regex pattern set, it could be refactored to be improved. I'm not sure if it is an explicit or regex pattern group. It might help if you can email us a language definition XML file and a sample file we can load up into our SDI Editor sample to see what you see and get a better idea on your pattern setup, perhaps once you have looked it over again with the suggestions above. You can email it to our support address and reference this thread.
3) I'm not seeing much slowdown when I do this in our samples. But perhaps the lexer performance issues with your particular pattern definitions are causing this.
OK I'll send the info to support.
However, the problem patterns were Explicit patterns - not RegEx - which is why I can't explain how 3 out of 1000 patterns could possibly cause such a huge performance difference.
Also the RegEx patterns that caused problems were simple patterns like '\.?(aaa|bbb|ccc)' so there would be no long searches.
(an optional period followed by one of 3 words)
Again, no obvious reason why that would have a major impact.
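For reference, the equivalence between a regex of that shape and a list of explicit patterns can be checked mechanically. The sketch below uses Python only as a scratchpad; "aaa", "bbb" and "ccc" are the placeholder words from the post, not real tokens from the language definition.

```python
import re
from itertools import product

# Placeholder words from the post; the optional prefix is a literal period.
prefixes = ["", "."]
words = ["aaa", "bbb", "ccc"]

# Expand the regex by hand into its explicit equivalents.
explicit = [prefix + word for prefix, word in product(prefixes, words)]

# The regex form: an optional period followed by one of the three words.
rx = re.compile(r"\.?(aaa|bbb|ccc)")

print(explicit)
print(all(rx.fullmatch(token) for token in explicit))
```

Each regex of this form expands to twice as many explicit patterns as it has words, so the text it can match is small and fixed.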
Any chance that you plan to implement the 'limited range' parsing in WinForms in the near future?
Also, any changes to make parsing more 'local', especially for very long lines, would be appreciated.
Hi Mike,
This sort of pattern also works (line start, consume spaces, match non-space sequence, consume rest of line to end):
^ [ ]* ([^ ]+) .+ $
Let me explain in more detail how the \b was causing problems. After the first replace, the pointer is after the "1". It sees a word boundary with the \n right after, which is a zero-width assertion. Then it matches a sequence of non-space, which is the \n character. Then the next character is a space so it skips to the .+ part, which consumes to the end of the line.
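The walk-through above can be reproduced with a toy model. The sketch below is written in Python and rests on two assumptions that are not confirmed anywhere in this thread: that Replace All edits the text in place and resumes searching right after the inserted replacement, and that `\b` behaves like the `\b+` in the original expression (Python rejects a quantifier on `\b`). The function name `replace_all_in_place` is hypothetical.

```python
import re

def replace_all_in_place(text, pattern, group=1):
    """Toy replace-all: edit the text in place, then resume after the replacement."""
    rx = re.compile(pattern)
    pos = 0
    while True:
        m = rx.search(text, pos)
        if m is None:
            return text
        replacement = m.group(group)
        text = text[:m.start()] + replacement + text[m.end():]
        # Continue right after the inserted text (at least one character on).
        pos = m.start() + max(len(replacement), 1)

# Shortened version of the sample text: (space)#(space)x on every line.
sample = " 1 x\n 2 x\n 3 x\n 4 x\n"

broken = replace_all_in_place(sample, r"\b([^ ]+).+")
fixed = replace_all_in_place(sample, r"\b([^ \n]+).+")

print(repr(broken))  # every second line is reduced to just its line break
print(repr(fixed))   # every line keeps its first word
```

In this model, once "1 x" has been replaced by "1", the search resumes on the word boundary between "1" and the line break, the capture group takes the line break, and `.+` swallows the whole next line. Excluding `\n` from the class removes that match, which is consistent with the fix suggested earlier.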
Our engine uses a large subset of the .NET regex engine (not VS) syntax, so it should generally be on par with results from that engine. There may be a couple very minor differences though depending on modes used in the .NET regex engine.