Bug Reports

Author Thread: Damaged workspaces
davin.church
Damaged workspaces
Posted: Monday, April 05, 2004 6:10 PM (EST)

If anyone is still fighting a problem with saved workspace sizes growing out of control, then you might be interested in some information about it...

 

First of all, this is caused by internal workspace damage that leaves unused (orphaned) blocks of data in the workspace.  These blocks then seem to replicate themselves over time, increasing the size of the workspace.

 

Second, this problem is probably more widespread on your machine than you imagine, because it's often difficult to detect by the WS size changes alone.

 

Third, the problem spreads from workspace to workspace via any )COPY command.  Because of this behavior, I've been calling it the first natural computer virus (for APL, at least?) because it occurred spontaneously rather than being created on purpose.

 

Ok, to simply detect the problem use the function call:

⎕it 'AuditRefcountsS'

which will return a 3x2 matrix of numbers.  If all the numbers are zero, the workspace is clean.  If not, then use:

⎕it 'AuditRefcountsC'

to repair the damage.  It will return the starting matrix of numbers and then repair those problems.  This operation may need to be performed several times, until all zeros are returned, because cleaning up some problems may uncover others.
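
A minimal sketch of that repeat-until-clean loop, assuming ⎕it (written []IT elsewhere in this thread) behaves as described above.  The function name FixRefcounts and the cap of 10 passes are made up for illustration; neither is part of APL+Win.

```apl
    ∇ passes←FixRefcounts;audit
      ⍝ Hypothetical helper, not a system function.
      ⍝ Each call to 'AuditRefcountsC' returns the counts found BEFORE
      ⍝ that call's repair, so an all-zero result means the workspace
      ⍝ was already clean when the call started.
      passes←0
      audit←⎕it 'AuditRefcountsC'
      :WHILE (∨/,audit≠0)∧passes<10    ⍝ arbitrary cap so the loop always ends
          passes←passes+1
          audit←⎕it 'AuditRefcountsC'
      :ENDWHILE
    ∇
```

The result is the number of passes that reported damage (capped at 10), so a result of 0 means the workspace was clean to begin with.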

 

Since this problem spreads like a virus, you should really clean up every workspace on your system (at the same time).  Also be cautious of workspaces that you receive from elsewhere (downloads, emails, network, etc.) so that your system doesn't become reinfected.

 

Good luck!


Comments:

Author Thread:
tom.atkins
Damaged workspaces
Posted: Friday, May 14, 2004 5:04 PM (EST)

Sorry for my ignorance, but I don't know what []it is. Is this a web services function?

     

davin.church
Damaged workspaces
Posted: Friday, May 14, 2004 5:20 PM (EST)
No, it's a hidden system quad-function used for internal diagnostics.

     

j.merrill
[]IT, refcounts, damaged workspaces
Posted: Saturday, May 15, 2004 2:51 PM (EST)
I don't recall exactly where I got them, but I have three "APL Technical Note" documents that I (think I) got from somewhere on the old APL2000 web site. I have searched for "technical note" on the new APLDN site and have not found them. Tom, I will zip them up and forward them to you. Hopefully someone at APL2000 will see this message and decide that these (and perhaps others) should be posted here somewhere. The ones I have are labelled numbers 401, 601 and 602 -- I would guess there are others.

     

brent.hildebrand
Damaged workspaces
Posted: Monday, May 31, 2004 7:16 PM (EST)

I have now run into this for the first time that I know of.  I have a workspace where I've been using some large variables and working with component files.   When I saved the workspace, it was about 100MB larger than the data contained in the workspace.  Copying the workspace contents did not solve the problem.  Here are the results of []IT:

      ⎕wssize-⎕wa
109884468
      ⎕wssize
603193344
      ⎕wa
493308888

 

then

      ⎕it 'AuditRefcountsS'
 795787 134688772
      0         0
      0         0
      ⎕it 'AuditRefcountsC'
 795787 134688772
      0         0
      0         0
      ⎕it 'AuditRefcountsC'
 0 0
 0 0
 0 0
      ⎕wssize
603193344
      ⎕wa
561904728
      ⎕wssize-⎕wa
41288628

 

The saved workspace dropped from 138MB to 38MB. 

 

This is the first time I have seen this.

5.2.01  May 13 2004 11:25:37  Win/32

     

davin.church
Damaged workspaces
Posted: Monday, May 31, 2004 8:01 PM (EST)

Yes, it hides very well.  You usually don't notice it until something gets so big that you trip over it.  As you noted, it isn't repaired by )COPY -- in fact it's spread by it.  So you might want to go check all your other workspaces too to see if it's spread to them already.  You might be surprised.

 

BTW, in the []IT reports... that first number is the number of broken items found in the workspace and the second is the number of bytes they occupy.  So you've probably been generating those for a while now.

     

brent.hildebrand
Damaged workspaces
Posted: Monday, May 31, 2004 8:16 PM (EST)

When the article on []IT came out on the old APL2000 web page, I checked my most active workspaces and never found any problems.  I've checked several that I have worked on in the last week, and have several with problems.  One I worked on playing with the Colossal file system:

C:\APLWIN5.2\COLOSSAL SAVED Tuesday, May 25, 2004 03:53:14 PM
      ]refcheck
 2 128
 0   0
 0   0

 

and

 

C:\TRACKWEB\TRACKWEB11 SAVED Thursday, May 27, 2004 10:33:11 PM
      ]refcheck
 261 1946900
   0       0
   0       0

 

and

C:\TRACKWEB\MAPDATA SAVED Saturday, May 29, 2004 02:19:56 PM
      ]refcheck
 105424 15946336
      0        0
      0        0

 

The one thing in common with all of these workspaces is that I've worked with component files, both the old and the new colossal.  I have not checked every one of my workspaces, but others I have checked are OK.  Of course, when using the User Command processor, one works with old component files, so that probably is not part of the issue.  The other thing these workspaces have in common is that I work with variables that frequently consume tens of millions of bytes. Hm...

     

brent.hildebrand
Damaged workspaces
Posted: Monday, May 31, 2004 8:26 PM (EST)

OK - I can reproduce one.  Here is the sequence of events, starting from a clear WS.  Note: I have a user command, ]refcheck, that performs []IT 'AuditRefcountsS'.

Note the error in the middle of the sequence of commands.

 

      )clear
CLEAR WS
      ]refcheck
 0 0
 0 0
 0 0
      'pop.cf' ⎕cftie 1
1
      ⎕cfsize 1
1 172579 44271964 0 812 0
      a←⎕fread ¨ 1,¨1+⍳172578
FILE TIE ERROR
      a←⎕fread ¨ 1,¨1+⍳172578
        ^
      a←⎕cfread ¨ 1,¨1+⍳172578
FILE INDEX ERROR
      a←⎕cfread ¨ 1,¨1+⍳172578
        ^
      ⎕io
1
      a←⎕cfread ¨ 1,¨⍳172578
      ]refcheck
 172577 29683244
 172578 29683416
      0        0

 

Here is another from a clear workspace.

      )clear
CLEAR WS
      ]refcheck
 0 0
 0 0
 0 0
      a←⎕cfread ¨ 1,¨⍳172578
      ]refcheck
      0        0
 172578 29683416
      0        0

Oh - this one is nice. <g>

      )clear
CLEAR WS
      a←⎕cfread 1 1
      ]refcheck
 0   0
 1 172
 0   0

OR, this:

      )clear
CLEAR WS
      a←⎕cfread 1 1
      ]refcheck
 0   0
 1 172
 0   0
      a←⎕cfread 1 1
      ]refcheck
 1 172
 1 172
 0   0
      a←⎕cfread 1 1
      ]refcheck
 2 344
 1 172
 0   0
      a←⎕cfread 1 1
      ]refcheck
 3 516
 1 172
 0   0

So perhaps the issue is just the new Colossal file system?  More testing to do...

 

 

     

brent.hildebrand
Damaged workspaces
Posted: Monday, May 31, 2004 8:34 PM (EST)

Simplest example, in version 5.2.01  May 13 2004 11:25:37  Win/32

 

CLEAR WS
      ]refcheck
 0 0
 0 0
 0 0
      'test.it' ⎕cfcreate 1
1
      ]refcheck
 0 0
 0 0
 0 0
      'Test' ⎕cfappend 1
1
      ]refcheck
 0 0
 0 0
 0 0
      ⎕cfread 1 1
Test
      ]refcheck
 1 28
 0  0
 0  0

 

Hope that helps.

     

davin.church
Damaged workspaces
Posted: Monday, May 31, 2004 8:44 PM (EST)

The problem isn't just the new file system, as it's been happening for some years.  (I even found a couple of damaged WSs on the 3.0(?) distribution disk.)  A few months ago my brother found another way to reproduce it by using the Code Walker in a trivial situation.  (That was reported upwards at the time and hopefully it's been helping them track it down.)

 

In your testing method, don't forget that you're also invoking all the UCMDS code every time you do that, so you might want to eliminate that from your test-confirmation procedure to make sure that's not contributing to the problem.  For myself, I wrote a little one-liner in a function that I could copy around that tests the result and []ERRORs if it's non-zero.  I also imbed it at the beginning and end of my development code (disabled for production) so that all my major workspaces constantly check themselves for re-infection before and after each test run (to try to narrow down the conditions where it occurs).
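
A check of the kind described above might look roughly like this.  It is only a sketch, not the actual function: the name CheckRefcounts and the message text are invented for illustration, and it is spread over several lines rather than one for readability.

```apl
    ∇ CheckRefcounts;audit
      ⍝ Hypothetical helper: report-only audit that signals an error
      ⍝ if any of the six counts is non-zero.  Nothing is repaired here.
      audit←⎕it 'AuditRefcountsS'
      :IF ∨/,audit≠0
          ⎕error 'REFCOUNT DAMAGE: ',⍕,audit
      :ENDIF
    ∇
```

Because it calls []IT directly rather than going through a user command, it keeps the UCMDS code out of the test.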

     

davin.church
Damaged workspaces
Posted: Monday, May 31, 2004 8:50 PM (EST)
That looks awfully suspicious!  I hope they can use that to track down this flavor of damage pretty easily.  (But as I just mentioned, that's not the only way it can happen.)

     

brent.hildebrand
Damaged workspaces
Posted: Monday, May 31, 2004 9:54 PM (EST)

Ah - and here is another, with a pointer (pun intended) from Davin.  If you are using Codewalker with a suspended function and press F10 to execute the next bit of code, then check AuditRefcountsS, it shows numbers in the second row, meaning potential orphans.  If you save the workspace in this state and later copy it into a clear workspace, row 1 will show that there are now orphans.  Easily reproducible.

 

While on the subject of Codewalker, do not put a primary stop on a :CASE statement.  Pressing F10 when the function is suspended with the first stop on a :CASE statement will cause APL to crash everything.  And I hate it when that happens.

 

 

     

j.merrill
Damaged workspaces
Posted: Monday, May 31, 2004 10:24 PM (EST)
It should indeed help APL2000 fix this. I believe "that's why it's a beta release" applies here!

     

Mark.Osborne
Damaged workspaces
Posted: Tuesday, June 01, 2004 8:53 AM (EST)

Davin and Brent both correctly point out that orphans and bad refcounts are not restricted to the new component file system.  But there was a major hole in the file system in this respect, which Brent noted.  This hole was fixed, along with a related bug, on 19 and 20 May, but this version of the code has not yet been distributed in a build.  Here are the Change Lists for those changes:


 

19 May 2004 - Change List 2883
General cleanup and two bug fixes in CompFileIF.cpp

Remove some dead code.

Remove some redundant code.

Clean up some error checking code.

Add some new error checks.

Improved error messages.

Fix bug - CFREAD left incorrect reference counts in new mentries.

Fix bug - CFREAD did not properly clean up excess space after reading components.  There is potentially a better approach here which may be implemented next - that would avoid reading the excess in the first place.

 


20 May 2004 - Change List 2896
Re-work CFREAD internally.  These changes eliminate reading any slop in the component and subsequently discarding said slop in the WS.  In addition, since the space use in the WS has been cleaned up in this and previous code, CFRDCI was rewritten to return the actual size consumed by a simple array, rather than the space required to read it into the WS (the two are now the same).  While in there, CFRDCI was made to support component 0 as well.

Change CFHIST to return file creation time values for a non-existing component 0.

Enable a gracious exit with error if SPRING() returns an error indication (this is in reading a nested array).

 

     

Andrew.Brown
Damaged workspaces
Posted: Wednesday, April 13, 2005 9:19 AM (EST)

Running ⎕it 'AuditRefcountsS' or ⎕it 'AuditRefcountsC' causes a system error.

 

Any other suggestions?  My wssize is now over 16MB; it started out at 891KB.

     

Support
Damaged workspaces
Posted: Wednesday, April 13, 2005 9:38 AM (EST)

Try resaving your workspace in your prior version of APL+Win, bringing it in with COPY instead of LOAD or XLOAD.  Then load the resaved workspace in version 5.0.

If that doesn't work, another option is the ]OUT and ]IN user commands.  Use ]OUT in your prior version of APL+Win and ]IN in version 5.0.  Help is available for both user commands by invoking either command with a ? as the argument.

APL2000 Support

P.S.  What is the version number of your 5.0 system?

     

davin.church
Damaged workspaces
Posted: Wednesday, April 13, 2005 12:04 PM (EST)
Copying the workspace with APLv4 is how I used to do it. But I'm worried about the system crashes now - I don't like having my repair tools break on me.

     

Support
Damaged workspaces
Posted: Wednesday, April 13, 2005 3:13 PM (EST)

It's really too soon to worry, Davin.  The problem reported by Andrew is the only reported case.

APL2000 Support

     

davin.church
Damaged workspaces
Posted: Wednesday, April 13, 2005 3:26 PM (EST)
Yeah... I got to thinking it might be because he has so many orphans that maybe it overflowed an integer accumulator or something. But you're right - it's probably an isolated case.

     

Andrew.Brown
Damaged workspaces
Posted: Wednesday, April 13, 2005 6:27 PM (EST)

You do not appear to have received my last response.

 

I have solved the problem by saving all the functions and variables to a file and then restoring them in a clear workspace.

It now only takes 600KB instead of 16MB!!

 

If anyone wants the function (28 lines), let me know.

andrew@cygnus-enterprises.ltd.uk

 

     


