Trying to solve a memory leak issue
Started by Darryl MEGANOSKI, 14 May 2018 20:03 - 4 replies
Registered member
5 messages
Posted on 14 May 2018 - 20:03
Hello all, if anyone could give me any hints on how to solve the memory leak issue I am having, it would be much appreciated. I will try to describe my scenario as best I can. I am mostly a web developer, so I am not very familiar with things like memory management.

I am tasked with scraping data from a website (my employer insisted on WINDEV). I have created two classes to handle the parsing of HTML and the DOM queries.

With threads, I have managed to get the application to process a little over 20 pages per second. The problem is that the application's memory usage keeps growing, and even after it has completed, the memory is not freed.

Here are the classes:

stAttribute is Structure
    sName is string
    sValue is string
END

HTMLElement is a Class
    m_nId is int // internal unique numerical ID for this element
    m_sTag is string // the HTML tag
    m_sContent is string // the contents, formatted to have child elements replaced with <ID>
    m_arrParents is array of int // array of numeric IDs of hierarchical parent elements
    m_arrChildren is array of int // array of IDs of direct descendant elements
    m_arrAllChildren is array of int // array of IDs of ALL descendant elements
    m_arrAttributes is array of stAttribute dynamic // array of structures representing attributes
    m_pclDocument is HTMLDocument dynamic // reference to the document class containing the array of other elements
END

HTMLDocument is a Class
    m_sOriginal is string // raw HTML, for debugging
    m_arrElements is array of HTMLElement dynamic // flat array of classes representing all of the elements
    m_nElementInt is int = 1 // internal incremental numeric ID to represent each element
    m_arrSelfClose is array of string // array of HTML tags that are self-closing
END

I have taken care to ensure that every class declaration is dynamic.
I only declare variables at the beginning of the functions, so as to avoid the issue with loops.
I am manually deleting the HTMLDocument class at the end of my function; I would assume that would also delete all the HTMLElements contained within it. I also tried deleting them in the destructor function, but it did not seem to make a difference.

Now, the Document class contains an array of Elements, and the element class contains a reference back to the Document. Could that be a potential issue?
Posted on 14 May 2018 - 23:55
Hi Darryl,

> I have taken care to ensure that every class declaration is dynamic.

That cannot be correct. A class declaration is not dynamic or static,
however you may be talking about instantiating an object of the class as
dynamic. If that is the case, that point is OK, but your description is not.

> I only declare variables at the beginning of the functions, as to avoid
> the issue with loops. I am manually deleting the HTMLDocument class at
> the end of my function, I would assume that would also delete all the
> HTMLElements contained within it. I had tried also deleting them in the
> destructor function, but it did not seem to make a difference.

You would be wrong. Dynamic objects are deleted BY CODE or when the
program ends (not the function). So if you are not EXPLICITLY deleting
them AT THE RIGHT TIME (and without seeing your code, it's impossible to
be precise), you have found your leak.

Best regards

--
Fabrice Harari
International WinDev, WebDev and WinDev mobile Consulting

Free Video Courses, free WXShowroom.com, open source WXReplication, open
source WXEDM.

More information on http://www.fabriceharari.com

Contact me at:
Email: fabrice@fabriceharari.com
Skype ID: fabriceharari
Telegram ID:@fabriceharari
Tel # in the USA: +1 985 746 1422
Tel # in France: +33 970 444 445 (local number 0970 444 445)

> Now, the Document class contains an array of Elements, and the element
> class contains a reference back to the Document. Could that be a
> potential issue?
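
The thread never settles whether the Document/Element back-reference matters, so the following is only a rough sketch of what fully explicit cleanup could look like. ReleaseDocument is a hypothetical helper (it does not appear in the original code), and the sketch assumes the class members shown earlier are publicly accessible. It clears each element's back-reference, deletes each element, then deletes the document.

```wlanguage
// Sketch only: ReleaseDocument is a hypothetical helper, not code from this thread.
// Assumes m_arrElements and m_pclDocument are accessible from outside the classes.
PROCEDURE ReleaseDocument(pclDoc is HTMLDocument dynamic)
pclElement is HTMLElement dynamic
i is int

IF pclDoc = Null THEN RETURN

FOR i = 1 _TO_ pclDoc.m_arrElements..Occurrence
	// Take a reference to the element (no copy)
	pclElement <- pclDoc.m_arrElements[i]
	// Break the element -> document back-reference
	pclElement.m_pclDocument = Null
	// Explicitly free the element
	Delete pclElement
END

// Empty the array so no stale references remain
ArrayDeleteAll(pclDoc.m_arrElements)

// Explicitly free the document itself
Delete pclDoc
```

Calling something like ReleaseDocument(pclDoc) at the end of the thread procedure that created the document would keep creation and cleanup on the same thread. Whether this changes the memory figures would have to be measured.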
Registered member
5 messages
Posted on 15 May 2018 - 16:23
Yes, I meant the variables that hold the classes are typed as dynamic when I create instances of them. Sorry for the confusion. I am not very good at communicating such things for this reason.

I did not know that dynamic objects were expected to be manually deleted. Good to know, thank you. There may be some places where I am missing the deletion of dynamic variables, then.

Since my project's task has a lot of intricacies, I decided to stop trying to debug the long process and to do some more 'controlled' tests, having the classes load the same generic HTML template a set number of times and then try cleaning up the memory.
Registered member
5 messages
Posted on 15 May 2018 - 19:53
Okay, so I'm getting the weirdest behavior from tests now, and I hope someone can shed some light on why this would be the case.

I set up one button in the UI which, when clicked, executes a thread that instantiates a single document class (generic HTML template) and then deletes it.

The funny thing is that the amount of memory left behind seems to be related to how quickly I click the button. Taking 3 seconds or so between clicks results in the application gaining between 4 KB and 20 KB per instance. If I click the button fast, 10 clicks result in a gain of over 1000 KB, which is over 100 KB per instance. Why would that happen?
Registered member
5 messages
Posted on 18 May 2018 - 20:33
For anyone interested now or in the future, it seems to be more related to the execution of threads rather than the classes.

I have been in contact with support and doing some tests. Without using threads, it takes forever to process even one document, but the memory does not increase much. Using one thread per 20 results takes about 1 second per document, with a slight memory issue. Using one thread per document results in processing around 20 results per second, with the memory building up very quickly (more memory remaining per document). It seems the memory used is not freed the same way in threads. Still working on a solution.
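
For anyone experimenting with the middle ground between "no threads" and "one thread per document", here is a minimal sketch of a fixed number of worker threads that each process a slice of the work. RunWorkers, WorkerLoop, ProcessOnePage and the pool size of 4 are all assumptions for illustration, not code from my project, and this is not a confirmed fix for the memory growth.

```wlanguage
// Sketch only: RunWorkers, WorkerLoop and ProcessOnePage are hypothetical names.
PROCEDURE RunWorkers(arrURL is array of string)
nWorkers is int = 4	// assumed pool size, to be tuned by measurement
i is int

// Start a fixed number of threads instead of one thread per document
FOR i = 1 _TO_ nWorkers
	ThreadExecute("Worker" + i, threadNormal, WorkerLoop, arrURL, i, nWorkers)
END

// Wait for every worker to finish before continuing
FOR i = 1 _TO_ nWorkers
	ThreadWait("Worker" + i)
END

PROCEDURE WorkerLoop(arrURL is array of string, nFirst is int, nStep is int)
i is int = nFirst

// Worker k handles items k, k + nStep, k + 2 * nStep, ...
WHILE i <= arrURL..Occurrence
	ProcessOnePage(arrURL[i])	// parse one page, then free its document
	i += nStep
END
```

Note that calling ThreadWait from the main thread blocks the UI until the workers finish, so in a real window RunWorkers would itself need to run in a secondary thread.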