fredreload Posted September 1, 2018
I have taken an interest in simulating the brain's neurons and synapses. As you know, there are billions of neurons and trillions of synapses in a person's brain. To simulate the nerve impulses running through these synapses I am looking at multiple GPUs, because I believe they have more power for simulating the brain than CPUs do, much as a GPU-based supercomputer like Titan compares to CPU-based supercomputers. To analyze the neurons and synapses, though, I would have to track the neurons and the ions that carry the nerve signal (Na+, K+). Tracking these requires a 3D structural identification technique. As an analogy: I leave a bag inside a building, scan the entire building and store the points, then use 3D graphics analysis to try to find the bag inside the building. I am not sure whether this task can be done on a GPU or a CPU, as it is computationally intensive. I am not even sure current hardware would permit something like this.
Frank Posted September 1, 2018
Maybe a sparse octree for the 3D representation, but I don't know if GPU libraries are easily available. Maybe Nvidia?
Strange Posted September 1, 2018
You might find existing libraries to do most of what you need: http://www.iwocl.org/resources/opencl-libraries-and-toolkits/
And there are several neural network frameworks as well: https://www.google.com/search?q=opencl+neural+networks
fredreload (Author) Posted September 2, 2018
17 hours ago, Frank said: Maybe a sparse octree for the 3D representation, but I don't know if GPU libraries are easily available. Maybe Nvidia?
Hi, yes, I am looking for a GPU-based 3D object tracking system. I'll take a look at sparse octrees, thanks.
Sensei Posted September 2, 2018
Binary trees, k-d trees, quadtrees and octrees should be implemented by a programmer who is knowledgeable enough (they are the basics) and is programming for real rather than just talking on a forum... For imaging internal organs, voxels (the 3D version of 2D pixels) are often used.
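For readers wondering what the sparse octree suggested above looks like in code, here is a minimal CPU-side sketch. The names `OctreeNode`, `childIndex`, `insert` and `contains` are hypothetical, and the layout (one heap-allocated node per occupied path, coordinates below 2^depth) is an assumption made for illustration, not a GPU library API.

```cpp
#include <array>
#include <cstdint>
#include <memory>

// Hypothetical sparse octree over a cube of side 2^depth voxels.
// Child nodes exist only along paths that lead to an occupied voxel.
struct OctreeNode {
    std::array<std::unique_ptr<OctreeNode>, 8> child; // null pointer = empty octant
    bool occupied = false;                            // meaningful at leaf level only
};

// Select one of the 8 octants from bit `level` of each coordinate.
inline int childIndex(std::uint32_t x, std::uint32_t y, std::uint32_t z, int level) {
    return static_cast<int>((((x >> level) & 1u) << 2) |
                            (((y >> level) & 1u) << 1) |
                            ((z >> level) & 1u));
}

// Mark voxel (x, y, z) as occupied. Coordinates are assumed to be < 2^depth.
void insert(OctreeNode& root, std::uint32_t x, std::uint32_t y, std::uint32_t z, int depth) {
    OctreeNode* node = &root;
    for (int level = depth - 1; level >= 0; --level) {
        auto& next = node->child[childIndex(x, y, z, level)];
        if (!next) next = std::make_unique<OctreeNode>(); // allocate on demand
        node = next.get();
    }
    node->occupied = true;
}

// Return true if voxel (x, y, z) was inserted earlier.
bool contains(const OctreeNode& root, std::uint32_t x, std::uint32_t y, std::uint32_t z, int depth) {
    const OctreeNode* node = &root;
    for (int level = depth - 1; level >= 0; --level) {
        const auto& next = node->child[childIndex(x, y, z, level)];
        if (!next) return false; // whole octant is empty, stop early
        node = next.get();
    }
    return node->occupied;
}
```

Empty regions cost nothing beyond a null pointer, which is the point of the "sparse" part: a mostly empty scan stores only the paths to occupied voxels.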
fredreload (Author) Posted September 2, 2018
18 minutes ago, Sensei said: Binary trees, k-d trees, quadtrees and octrees should be implemented by a programmer who is knowledgeable enough (they are the basics) and is programming for real rather than just talking on a forum... For imaging internal organs, voxels (the 3D version of 2D pixels) are often used.
Yes, I know that if I actually wanted to implement this I would follow the steps in post 1. My question is what the difference between a CPU and a GPU is. For instance, in the YouTube video I posted, could TensorFlow have run on the CPU instead? And how would TensorFlow on a CPU differ from TensorFlow on a GPU? To my knowledge, GPU TensorFlow would be more efficient (faster) because of its clock speed, since a GPU pretty much consists of multiple clocks. So is the sparse octree also CPU-powered or GPU-powered, and what is the difference? P.S. Yeah, well, 3D voxels are awesome.
Edited September 2, 2018 by fredreload
Sensei Posted September 2, 2018
56 minutes ago, fredreload said: My question is what the difference between a CPU and a GPU is.
Show me your OpenCL code to analyze... or whatever else you are using... CUDA?
56 minutes ago, fredreload said: To my knowledge, GPU TensorFlow would be more efficient (faster) because of its clock speed, since a GPU pretty much consists of multiple clocks.
A GPU has, e.g., 1024 cores, each running at a fairly low frequency, while a CPU has very few cores: e.g. 4-6 cores (8-12 hyper-threads) for a fairly modern CPU as of 2018. The thing is, you have never written real code for a GPU, so you have no knowledge of it. Write code and you will gain knowledge. Simple as that. Easier than writing a post on a forum..
Edited September 2, 2018 by Sensei
Strange Posted September 2, 2018
1 hour ago, Sensei said: Binary trees, k-d trees, quadtrees and octrees should be implemented by a programmer who is knowledgeable enough (they are the basics) and is programming for real rather than just talking on a forum...
So you are saying people should not ask for advice?
1 hour ago, Sensei said: Show me your OpenCL code to analyze... or whatever else you are using... CUDA?
If someone is looking for advice on choosing between a GPU and a CPU for an application, that probably implies they haven't written any code yet.
1 hour ago, Sensei said: The thing is, you have never written real code for a GPU, so you have no knowledge of it. Write code and you will gain knowledge. Simple as that. Easier than writing a post on a forum..
So people can't ask for advice from people who may have the relevant experience? Don't be so arrogant.
Sensei Posted September 2, 2018
29 minutes ago, Strange said: So you are saying people should not ask for advice?
No. People should ask for advice if they are really stuck. But fredreload asks for advice and then does NOT use it in the real world.. and this is repeated over and over again in many different threads..
29 minutes ago, Strange said: If someone is looking for advice on choosing between a GPU and a CPU for an application, that probably implies they haven't written any code yet.
The thing is, he is not seeking advice.. Advice could be given and then used by somebody really seeking it.. but he is not using any of the advice given here on this forum.. Somebody really seeking advice would write the same piece of code for a multi-threaded CPU, then for a multi-core GPU in OpenCL/CUDA, benchmark both on a few CPUs/GPUs, learn what the benefit is (or is not) from experimental data gathered by himself/herself, and share the findings on the forum. Every graphics card has a different number of cores.. his CPU can have 1-6 real cores and 0-12 hyper-threads.. It's impossible to give him a definite answer about what will work best for him, on his machine. But he is not going to check it, and will not write any code to check it... The questions are purely theoretical.. I would prefer talking with somebody who uses my advice in the real world and implements it once it is given. Then we can check whether the advice was good (or not)..
Edited September 2, 2018 by Sensei
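The benchmarking step described above can be sketched with a small timing helper. This is only an illustration under stated assumptions: `bestSeconds` is a hypothetical name, it measures wall-clock time on the host, and a GPU version would additionally need to wait for the device to finish before stopping the clock.

```cpp
#include <chrono>

// Hypothetical helper: run `work` `repeats` times and return the best
// (smallest) wall-clock time in seconds. Returns -1.0 if repeats <= 0.
template <typename F>
double bestSeconds(F&& work, int repeats) {
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        auto start = std::chrono::steady_clock::now();
        work(); // the code under test, e.g. the CPU or the GPU variant
        auto stop = std::chrono::steady_clock::now();
        double s = std::chrono::duration<double>(stop - start).count();
        if (best < 0.0 || s < best) best = s; // keep the fastest run
    }
    return best;
}
```

Taking the best of several runs is one common way to reduce noise from other processes; timing the CPU and GPU variants of the same calculation with the same helper gives numbers that can be compared directly.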
Strange Posted September 2, 2018
19 minutes ago, Sensei said: Somebody really seeking advice would write the same piece of code for a multi-threaded CPU, then for a multi-core GPU in OpenCL/CUDA, benchmark both on a few CPUs/GPUs, learn what the benefit is (or is not) from experimental data gathered by himself/herself, and share the findings on the forum.
I think you should join Stack Exchange; you would fit right in with their elitist attitude of only wanting the “right sort of question”.
Sensei Posted September 2, 2018
Elitist attitude? Thanks for showing a complete lack of understanding.. It's not about asking "the right sort of questions", but about what is done with the answers already given to earlier ones (going by past experience)..
Edited September 2, 2018 by Sensei
fredreload (Author) Posted September 2, 2018
2 hours ago, Sensei said: The thing is, he is not seeking advice.. Advice could be given and then used by somebody really seeking it.. but he is not using any of the advice given here on this forum.. Somebody really seeking advice would write the same piece of code for a multi-threaded CPU, then for a multi-core GPU in OpenCL/CUDA, benchmark both on a few CPUs/GPUs, learn what the benefit is (or is not) from experimental data gathered by himself/herself, and share the findings on the forum.
Yeah, it's my fault, as you've clearly stated what I should be doing. Still, it doesn't hurt to give someone an "answer" if you already "know" "who is faster" in a comparison between "CPU" and "GPU" running "TensorFlow" :D. Just because I can make toast does not mean I need to build a toaster = =. Just because I am an American does not mean I have to relive the Mayflower voyage. And I would work on it myself if "there is no available answer". But I know people with a hands-on attitude to things and I respect that. You and Strange should chill. If you have a definite reason why I must work on it before obtaining an answer, let me know. People have complained to me about this before, but I could not see the reason D:. That doesn't mean I am correct; I am open to suggestions, as you clearly know the question I am asking. The previous questions I asked on this forum were mostly hypothetical. I have limited resources and budget to spend on things D:, I'm just a science enthusiast. Maybe I should have started a company, sigh.
Edited September 2, 2018 by fredreload
Sensei Posted September 2, 2018
I am sorry if my previous posts were too harsh..
fredreload (Author) Posted September 3, 2018
12 hours ago, Sensei said: I am sorry if my previous posts were too harsh..
Nah, nobody has complained to me about you being too harsh. If it's something I need to learn myself, then that is the right answer. P.S. Unless you have an answer @@
Edited September 3, 2018 by fredreload
Sensei Posted September 3, 2018
3 hours ago, fredreload said: If it's something I need to learn myself, then that is the right answer.
A good start would be searching the net for "cuda programming tutorial", e.g. https://www.tutorialspoint.com/cuda/cuda_quick_guide.htm There is a comparison table of CPU GFLOPS versus GPU GFLOPS by year of production. But the real results are the ones you get after writing equivalent code for both CPU and GPU and benchmarking them.. Some algorithms are easy to parallelize (to run on multiple CPU cores, CPU threads, or GPU cores), and others are very hard or impossible to convert, mostly because the result of a previous calculation is used in a later one, each depending on the other. An algorithm that cannot be parallelized will obviously work best on a CPU, which can run at 2.0... 3.0 GHz per core (the higher the frequency the better, and the fewer CPU cycles per instruction the better), because each GPU core runs at a much lower frequency than a CPU core (but there can be thousands of them!)..
Edited September 3, 2018 by Sensei
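The distinction above between easy and hard-to-parallelize algorithms can be illustrated with two small loops. Both functions are hypothetical examples chosen for this sketch (`scaleAll` and `iterateMap` are not from any library); the second uses the logistic map simply as a well-known recurrence.

```cpp
#include <cstddef>
#include <vector>

// Independent iterations: out[i] depends only on in[i], so the iterations
// could be handed to different cores in any order.
std::vector<float> scaleAll(const std::vector<float>& in, float k) {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] * k;
    }
    return out;
}

// Dependent iterations: step i needs the result of step i - 1, so the
// steps have to run one after another however many cores are available.
float iterateMap(float x0, float r, int steps) {
    float x = x0;
    for (int i = 0; i < steps; ++i) {
        x = r * x * (1.0f - x); // uses the previous x
    }
    return x;
}
```

The first loop is the kind of work that maps well onto thousands of slow GPU cores; the second gains nothing from extra cores and is mostly limited by single-core speed.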
fredreload (Author) Posted September 3, 2018
10 minutes ago, Sensei said: A good start would be searching the net for "cuda programming tutorial", e.g. https://www.tutorialspoint.com/cuda/cuda_quick_guide.htm There is a comparison table of CPU GFLOPS versus GPU GFLOPS by year of production. But the real results are the ones you get after writing equivalent code for both CPU and GPU and benchmarking them.. Some algorithms are easy to parallelize (to run on multiple CPU cores, CPU threads, or GPU cores), and others are very hard or impossible to convert, mostly because the result of a previous calculation is used in a later one, each depending on the other.
Hmm, I saw a video, in Korean, comparing CPU and GPU rendering of a 3D structure. I think it was a brain structure, and the user could switch between GPU rendering and CPU rendering. It was programmed in C# in Visual Studio and was posted somewhere on this forum, and I think the GPU was much faster. I know I've read the GFLOPS figures for each, but I don't think they account for the big picture. I'll have to do some digging to find that post. My bad, I might have gotten an answer before = =
Edited September 3, 2018 by fredreload
Sensei Posted September 3, 2018
23 minutes ago, fredreload said: Hmm, I saw a video, in Korean, comparing CPU and GPU rendering of a 3D structure. I think it was a brain structure, and the user could switch between GPU rendering and CPU rendering. It was programmed in C# in Visual Studio and was posted somewhere on this forum, and I think the GPU was much faster.
3D visualization is the main job of a modern graphics card. That is what they were designed for. But I am talking about general use of GPUs for things outside 2D/3D graphics: heavy math on floating-point numbers and integers (GFLOPS = giga floating-point operations per second). Drawing a triangle is a very simple procedure and can easily be implemented in hardware. In that case graphics cards are compared by the number of triangles per second ("millions/billions of triangles per second").
Edited September 3, 2018 by Sensei
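As a small worked example of the GFLOPS unit mentioned above: the figure is just the number of floating-point operations divided by the elapsed time, scaled by 10^9. The function name `gflops` is hypothetical, and the operation count has to be supplied by whoever knows the algorithm.

```cpp
#include <cassert>

// GFLOPS = floating-point operations / elapsed seconds / 1e9.
// `operations` is counted by hand for the algorithm being measured.
double gflops(double operations, double seconds) {
    assert(seconds > 0.0);
    return operations / seconds / 1e9;
}
```

For instance, an element-wise calculation that performs one addition and one multiplication per element does 2 operations per element, so 10^9 elements processed in one second would correspond to 2 GFLOPS.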
fredreload (Author) Posted September 3, 2018
13 minutes ago, Sensei said: 3D visualization is the main job of a modern graphics card. That is what they were designed for. But I am talking about general use of GPUs for things outside 2D/3D graphics: heavy math on floating-point numbers and integers (GFLOPS = giga floating-point operations per second). Drawing a triangle is a very simple procedure and can easily be implemented in hardware. In that case graphics cards are compared by the number of triangles per second ("millions/billions of triangles per second").
Well, I am not sure what calculations are involved in TensorFlow, but a Chinese site I found compared CPU TensorFlow with GPU TensorFlow, and GPU TensorFlow is indeed faster. So I am curious to know what types of calculations, like those in TensorFlow, can be done millions/billions of times per second. I mean, couldn't all repeated calculations use the GPU, thanks to its parallel computing architecture?
Edited September 3, 2018 by fredreload
Sensei Posted September 3, 2018
31 minutes ago, fredreload said: So I am curious to know what types of calculations,
Suppose you have the calculation:
    a = ( b + c ) * d
b + c must be added first (say e = b + c), and only then a = e * d. The multiplication relies on the result of the addition! Such an operation can't be done in parallel. But if you have millions of such operations, you can split the work: first do all the b + c additions in a loop, e.g.
    for( int i = 0; i < max; i++ ) { e[i] = b[i] + c[i]; }
and then
    for( int i = 0; i < max; i++ ) { a[i] = e[i] * d[i]; }
or alternatively:
    // j stands for the thread index; each thread takes every thread_max-th element
    for( int j = 0; j < thread_max; j++ ) {
        for( int i = j; i < max; i += thread_max ) {
            a[ i ] = ( b[ i ] + c[ i ] ) * d[ i ];
        }
    }
Other calculations or operations are much harder to parallelize. You need to know the entire algorithm to judge whether it can be made to work in parallel or not. The CPU splits the job (like an office manager in a company) and tells each GPU core which chunk of data it has to process. When the job is done, it integrates the data from the various GPU cores, or orders them to do the final integration.
Edited September 3, 2018 by Sensei
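The strided loop above still runs its outer loop sequentially; the sketch below shows one way the same split could be run on real CPU threads with `std::thread`. The function name `fusedParallel` and the use of `std::vector<float>` are assumptions made for this illustration, and on some toolchains the program must be built with thread support enabled (for example `-pthread`).

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Computes a[i] = (b[i] + c[i]) * d[i]. Worker j handles the indices
// j, j + threadMax, j + 2 * threadMax, ... so no two workers ever
// write to the same element.
std::vector<float> fusedParallel(const std::vector<float>& b,
                                 const std::vector<float>& c,
                                 const std::vector<float>& d,
                                 unsigned threadMax) {
    assert(b.size() == c.size() && c.size() == d.size());
    assert(threadMax > 0);
    std::vector<float> a(b.size());
    std::vector<std::thread> workers;
    for (unsigned j = 0; j < threadMax; ++j) {
        workers.emplace_back([&, j] {
            for (std::size_t i = j; i < a.size(); i += threadMax) {
                a[i] = (b[i] + c[i]) * d[i];
            }
        });
    }
    for (auto& w : workers) {
        w.join(); // wait until every worker has finished its share
    }
    return a;
}
```

On a CPU, giving each thread one contiguous block of indices is usually kinder to the caches than striding; the strided form is kept here only to mirror the loop in the post.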
fredreload (Author) Posted September 3, 2018
5 minutes ago, Sensei said: Suppose you have the calculation:
    a = ( b + c ) * d
b + c must be added first (say e = b + c), and only then a = e * d. The multiplication relies on the result of the addition! Such an operation can't be done in parallel. But if you have millions of such operations, you can split the work: first do all the b + c additions in a loop, e.g.
    for( int i = 0; i < max; i++ ) { e[i] = b[i] + c[i]; }
and then
    for( int i = 0; i < max; i++ ) { a[i] = e[i] * d[i]; }
or alternatively:
    for( int j = 0; j < thread_max; j++ ) {
        for( int i = j; i < max; i += thread_max ) {
            a[ i ] = ( b[ i ] + c[ i ] ) * d[ i ];
        }
    }
Other calculations or operations are much harder to parallelize. You need to know the entire algorithm to judge whether it can be made to work in parallel or not.
In this case, what makes TensorFlow a desirable subject for GPU computation?
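One commonly given reason, offered here as a sketch and not as a description of TensorFlow's internals, is that neural-network workloads are dominated by matrix multiplications, and every element of the result can be computed independently of the others. The function `matmul` below is a hypothetical, deliberately naive CPU version that makes that independence visible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C (n x m) = A (n x k) * B (k x m), all stored row-major.
// Each C[i][j] reads only row i of A and column j of B, so all n * m
// output elements could be computed at the same time by separate cores.
std::vector<float> matmul(const std::vector<float>& A,
                          const std::vector<float>& B,
                          std::size_t n, std::size_t k, std::size_t m) {
    assert(A.size() == n * k && B.size() == k * m);
    std::vector<float> C(n * m, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            float sum = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                sum += A[i * k + p] * B[p * m + j];
            }
            C[i * m + j] = sum;
        }
    }
    return C;
}
```

The two outer loops have the same shape as the independent element-wise loops discussed earlier in the thread, only with far more work per element.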
Frank Posted September 3, 2018
On 9/2/2018 at 4:15 AM, fredreload said: So is the sparse octree also CPU-powered or GPU-powered, and what is the difference?
One consideration is that GPUs tend to be limited in memory space because fast RAM is expensive. So very large structures might be better handled on the CPU side, or the CPU side can break the data down into pieces of a size manageable by the GPU side(s). In cryptocurrency mining, multiple GPU boards can run in a single system, so some fast parallel processing can be done that way.
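The idea of the CPU side breaking a large data set into GPU-sized pieces can be sketched as a simple range splitter. `makeChunks` is a hypothetical helper; the chunk size would in practice be chosen from the memory actually available on the card, which this sketch does not query.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Split the index range [0, total) into consecutive [begin, end) ranges
// of at most `chunk` elements each. The last range may be shorter.
std::vector<std::pair<std::size_t, std::size_t>>
makeChunks(std::size_t total, std::size_t chunk) {
    assert(chunk > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t begin = 0; begin < total; begin += chunk) {
        ranges.emplace_back(begin, std::min(begin + chunk, total));
    }
    return ranges;
}
```

The host would then copy one range at a time to the device (or one range per GPU board in a multi-GPU system), process it, and collect the results before moving on.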
fredreload (Author) Posted September 3, 2018
8 minutes ago, Frank said: One consideration is that GPUs tend to be limited in memory space because fast RAM is expensive. So very large structures might be better handled on the CPU side, or the CPU side can break the data down into pieces of a size manageable by the GPU side(s). In cryptocurrency mining, multiple GPU boards can run in a single system, so some fast parallel processing can be done that way.
I know working with a brain of 100 billion neurons is not easy, lol. But the training images would involve potassium and sodium ions, just like the cards. What this brain runs on in real time is the GPU, but the training part, recognizing the potassium and sodium ions (like the cards), would be based on the GPU. So once you bring the 3D brain mesh into view, it would automatically locate all the potassium and sodium ions within the brain (tracking in real time). It sounds simple, but it's just speculation.