Automotive operating systems (OSes) are becoming increasingly popular in the industry. OEMs are announcing their own versions, such as VW.OS and MB.OS, and multiple open-source initiatives are defining the software-defined vehicle. However, there is neither a widely accepted definition of an automotive OS nor a consensus on the functionality that it provides or the concepts that it implements. It is not an operating system in the traditional sense.
In the automotive industry, software is claiming an increasingly large portion of value creation. A 2021 study by Berylls projected that the software market in automotive will triple by 2030. While this comes along with a significant increase in software complexity, the productivity gains cannot keep pace. This growing complexity puts high pressure on development cost and, more importantly, on development capacities. It reduces the ability to quickly innovate and iterate. EV startups and other new entrants are putting even more pressure on innovation cycles—with Tesla serving as the industry's role model for rolling out new software functionality to vehicles that have already been sold.
To address this productivity issue, changes in architecture—in the way that software is sourced, built and maintained—are required. This is where the automotive OS comes into play.
What is the automotive OS?
An automotive OS is a software platform that abstracts the complex vehicle network of electronic control units (ECUs) as one device and then manages, supervises, and updates this device.
Applications and functions are built against the APIs of the automotive OS to ensure maximum portability and maintainability and form the ecosystem of the automotive OS.
Seeing vehicles as devices
Traditionally, software was a part of an ECU that was tailored to meet a specific set of functionalities for a vehicle. However, as the relevance of software grew, common software parts of these ECUs were standardized to allow for reuse and harmonization of system concepts and semantics. Standards, such as OSEK and AUTOSAR, emerged to allow for higher degrees of software integration and reuse across different OEMs.
Despite this standardization, each ECU is still sourced and built individually. As a result, there is a variety of middleware implementations and concepts used across different function domains within a single vehicle. This concept becomes limiting as the degree of software integration, interdependencies among functions and number of updates increase. To address these issues, architectural concepts, such as service-oriented communication and virtualization technology, were introduced. However, the required productivity gains have yet to be achieved.
The automotive OS aims to eliminate variants of middleware technology across different ECUs, i.e., to use the same implementation and harmonize system concepts and semantics on a vehicle level, resulting in a harmonized vehicle abstraction for a given vehicle platform.
A fourth value proposition is not yet as visible in the automotive software market: separating the life cycles of the vehicle platform, the software platform and the function itself. This is the crucial point in bringing an automotive OS to success.
Defining layers
To serve this value proposition, the automotive OS must go beyond established middleware technology: it abstracts the complex network of ECUs as one device, and it manages, supervises and updates this device.
The high-level architecture of the automotive OS includes four layers: the core software layer, the middleware layer, the platform services layer and the applications layer. The core software layer includes hardware-dependent software, such as operating systems and virtualization technology; the middleware layer manages application software and its life cycle on an ECU or partition; the platform services layer brings the control plane of the software platform to a vehicle-wide level; and the applications layer is where applications are executed.
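To make the layering concrete, here is a minimal sketch in C; every name in it is hypothetical, not drawn from any shipping automotive OS. An application binds to a vehicle-wide service through the platform-services API and never needs to know which ECU hosts the function:

    /* Hypothetical sketch of the automotive OS layering; none of these
     * names come from a real automotive OS. */
    #include <stdint.h>
    #include <stdio.h>

    /* Platform-services layer: vehicle-wide service lookup. */
    typedef struct {
        const char *name;                /* e.g. "vehicle/speed" */
        int (*read)(int32_t *value_out); /* transport is the middleware's job */
    } vehicle_service_t;

    /* Middleware layer: would resolve the service to a concrete ECU
     * endpoint; stubbed locally here so the sketch stays runnable. */
    static int read_speed_stub(int32_t *v) { *v = 42; return 0; }

    static vehicle_service_t lookup_service(const char *name) {
        vehicle_service_t s = { name, read_speed_stub };
        return s;
    }

    /* Applications layer: written only against the platform API, so it
     * stays portable across ECU topologies and platform updates. */
    int main(void) {
        vehicle_service_t speed = lookup_service("vehicle/speed");
        int32_t kmh = 0;
        if (speed.read(&kmh) == 0)
            printf("vehicle speed: %d km/h\n", (int)kmh);
        return 0;
    }

The middleware can later re-route the service to a different ECU, or the platform itself can be upgraded, without touching application code, which is exactly the life-cycle decoupling discussed later in this article.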
Harmonizing these layers for a complete vehicle platform provides significant advantages in terms of development efficiency, software updates and software maintenance. The key is to eliminate variants of middleware technology across different ECUs, to eliminate domain-specific variants and to harmonize system concepts and semantics on a vehicle level. This on-board software is complemented by a cloud-based CI/CD and simulation and validation framework to enable scalability of software development processes across stakeholders.
The software platform ecosystem challenge
As mentioned earlier, the automotive OS's fourth value proposition is about separating the life cycles of the vehicle platform, the software platform and the function itself. This means upgrading the software platform of a device that has already been sold, which enables new revenue streams by selling new software functions to owners of existing devices. When that software platform is upgraded, time and resources no longer have to be dedicated to maintaining older versions of software. This decoupling of the life cycle of the software platform from that of the hardware platform also reduces the number of software platforms to concurrently maintain.
Additionally, decoupling the life cycle of software applications from that of the software platform permits the integrator to update the software platform without dropping compatibility to existing applications, creating an ecosystem of apps and reducing maintenance efforts. This concept is essential for creating an ecosystem of functions.
Bottom line
Software is becoming the defining factor for innovation in the automotive industry. With this, software complexity is rising and crippling innovation speeds and development productivity.
The automotive OS is the industry's best bet for making this complexity manageable. It aims to harmonize software across the vehicle, to reduce the notoriously high number of variants, to harmonize development practices and interfaces and to make maintenance tractable.
In the long run, the promise of the automotive OS is to create an ecosystem of functions that can be developed and maintained independently of the underlying vehicle, such that innovation cycles of hardware and software can be uncoupled.
In previous articles, we established the many ways FPGAs surpass other AI chipsets for running machine learning algorithms at the edge in terms of reconfigurability, power consumption, size, speed, and cost, and how the microarchitecture-agnostic RISC-V instruction set architecture (ISA) marries up seamlessly with the architectural flexibility of the FPGA. However, the apparent lack of mid-range, cost-effective FPGAs and their less-than-straightforward design flow remain a major bottleneck — the skills required for a fully custom hardware description language (HDL) implementation are difficult to find and often come with a steep learning curve.
Efinix fills the gap with FPGAs built on the innovative Quantum compute fabric, made up of reconfigurable tiles known as exchangeable logic and routing (XLR) cells that function as either logic or routing, rethinking the traditional fixed ratio of logic elements (LEs) and routing resources. This allows for a high-density fabric in a small device package where no part of the FPGA is underutilized. The potential of this platform transcends the typical barriers facing edge-based devices today: power consumption, latency, cost, size, and ease of development.
Possibly the most striking feature of Efinix FPGAs is the ecosystem and state-of-the-art tool flow surrounding them, which lowers development barriers and allows designers to readily implement AI at the edge using the same silicon from prototype to production. Efinix has embraced RISC-V, allowing users to create applications and algorithms in software — capitalizing on the ease of programmability of this ISA without being bound to proprietary IP cores such as those from ARM. Because this all runs on flexible FPGA fabric, users can then massively accelerate those algorithms in hardware. Efinix offers support for both low-level and more complex custom-instruction acceleration, including the TinyML accelerator and predefined hardware accelerator sockets. With approaches such as these, the resulting acceleration delivers hardware performance while retaining a software-defined model that can be iterated and refined without the need to learn VHDL. The result is blazing-fast speeds for edge devices that consume little power and fit within a small footprint. This article discusses precisely how the Efinix platform simplifies the entire design and development cycle, allowing users to take advantage of the flexible FPGA fabric for a scalable embedded processing solution.
Barriers at the edge — a dam blocking progress
From massive wireless sensor networks to streaming a high-resolution 360° immersive AR or VR experience, most of the world's data lies at the edge. Disaggregating the compute burden from the cloud and bringing it closer to the devices opens doors for next-generation, bandwidth-hungry, ultra-low-latency applications in autonomous driving, immersive digital experiences, autonomous industrial facilities, telesurgery, and so on. The use cases are endless once the enormous roadblock of transmitting data to and from the cloud is sidestepped.
However, the defining requirements of compute at the edge (low latency on small, power-limited devices) are the very factors that pose a significant design challenge. How, then, can such a device run demanding ML algorithms without investing in elaborate technologies? The usual solution is to implement whatever hardware is sufficient to run the application (e.g., CPU, GPU, ASIC, FPGA, ASSP) while accelerating the most compute-intensive tasks, balancing compute time (latency) against resources used (power consumed).
As with any innovation, the landscape of deep learning is continually shifting with updating models and optimization techniques, necessitating the use of more agile hardware platforms that can change almost as rapidly as the programs running on them with little to no risk. The parallel processing and flexibility/reconfigurability of FPGAs seem to line up seamlessly with this need. However, making these devices available for mainstream, high-volume applications requires lowering the design barriers for configuring and accelerating the FPGA fabric — a time-consuming process that normally requires a high degree of expertise. Furthermore, traditional accelerators are typically not granular enough and incorporate large pieces of a model that typically do not scale well. They also generally consume far too much power and are, more often than not, proprietary — causing engineers to relearn how to use the vendor-specific platform.
The Sapphire RISC-V core
Creating an application on the RISC-V core in C/C++
Efinix squarely addresses all of these potential obstacles by taking on the challenge of making FPGAs available to the AI/ML community in an intuitive way. The RISC-V Sapphire core is fully user-configurable through the Efinity GUI; users do not have to know the VHDL behind implementing the RISC-V in the FPGA and can exploit the straightforward programmability of common software languages (e.g., C/C++). This allows teams to rapidly develop applications and algorithms in software. All the required peripherals and buses can be specified, configured, and instantiated alongside the Sapphire core to deliver a fully configured SoC (Figure 1). This RISC-V capability includes multi-core (up to four cores) support and Linux capability, delivering a high-performance processor cluster for a designer's FPGA application as well as the ability to run applications directly on the Linux operating system. The next step, hardware acceleration, is greatly simplified by hardware-software partitioning: once designers have perfected their algorithm in software, they can progressively start to accelerate it within the flexible Efinix FPGA fabric. Before moving on to that step, however, it is important to understand the inherent benefits of the RISC-V architecture and how they can be exploited within the FPGA fabric.
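To ground the software-first step, here is the kind of plain-C baseline a team might perfect on the Sapphire core before accelerating anything: a minimal 3×3 convolution whose kernel, image size, and function names are illustrative assumptions, not Efinix code.

    /* Minimal software baseline: a 3x3 convolution in plain C, the kind
     * of compute primitive later accelerated in the FPGA fabric. All
     * names and sizes here are illustrative assumptions. */
    #include <stdint.h>
    #include <stdio.h>

    #define W 8
    #define H 8

    static void conv3x3(const uint8_t in[H][W], const int8_t k[3][3],
                        int32_t out[H - 2][W - 2]) {
        for (int y = 1; y < H - 1; y++) {
            for (int x = 1; x < W - 1; x++) {
                int32_t acc = 0;
                for (int ky = -1; ky <= 1; ky++)
                    for (int kx = -1; kx <= 1; kx++)
                        acc += k[ky + 1][kx + 1] * in[y + ky][x + kx];
                out[y - 1][x - 1] = acc;  /* multiply-accumulate hot spot */
            }
        }
    }

    int main(void) {
        uint8_t img[H][W] = {{0}};
        img[3][3] = 255;                   /* single bright pixel        */
        int8_t edge[3][3] = {{-1, -1, -1}, /* simple edge-detect kernel  */
                             {-1,  8, -1},
                             {-1, -1, -1}};
        int32_t out[H - 2][W - 2];
        conv3x3(img, edge, out);
        printf("center response: %ld\n", (long)out[2][2]);
        return 0;
    }

The triple-nested multiply-accumulate loop is precisely the hot spot that the acceleration steps described below progressively move into hardware.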
Custom-instruction-capable RISC-V
The RISC-V architecture is unique in that it does not have all of its instructions defined; instead, a few instruction encodings are left open for the designer to define and implement. In other words, a custom arithmetic logic unit (ALU) can be created that performs an arbitrary function whenever it is invoked by the custom instruction (Figure 2). These custom instructions have the same format as the rest of the instructions (e.g., two registers in, one register out), granting a total of eight bytes of input data to work with and four bytes that can be passed back to the RISC-V.
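From C, such an instruction can be reached without any vendor tooling, using the GNU assembler's .insn directive. A minimal sketch, assuming the custom ALU is wired to the RISC-V custom-0 opcode (0x0B); the funct3/funct7 values and the operation the ALU performs are assumptions for illustration:

    #include <stdint.h>
    #include <stdio.h>

    /* Invoke a custom R-type instruction on the custom-0 opcode (0x0B):
     * two source registers in, one destination register out, matching
     * the format described above. The funct3/funct7 values (0/0) and
     * the operation performed are illustrative assumptions. Requires a
     * RISC-V toolchain; on a core without the custom ALU this traps as
     * an illegal instruction. */
    static inline uint32_t custom_insn(uint32_t a, uint32_t b) {
        uint32_t result;
        __asm__ volatile(".insn r 0x0B, 0, 0, %0, %1, %2"
                         : "=r"(result)
                         : "r"(a), "r"(b));
        return result;
    }

    int main(void) {
        uint32_t y = custom_insn(3, 4); /* one instruction on real hardware */
        printf("custom result: %u\n", (unsigned)y);
        return 0;
    }

Because the wrapper is an ordinary inline function, the rest of the codebase never sees any assembly.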
However, since the ALU is built within the FPGA, it can also fetch data directly from the FPGA fabric. This allows users to expand beyond the eight bytes of register data and make the ALU arbitrarily complex, giving it access to data already present in the FPGA (e.g., data from sensors). The ability to have an arbitrarily complex ALU is a multiplying factor for speed when it comes to hardware acceleration. Efinix has taken this custom-instruction capability and adapted it for the AI and ML communities with the TinyML platform.
The TinyML platform — a library of custom instructions
Hardware acceleration with the TinyML platform
The TinyML platform streamlines hardware acceleration: Efinix has taken the compute primitives used in TensorFlow Lite models and created custom instructions to optimize their execution on accelerators in the FPGA fabric (Figure 3). Through this, standard software-defined TensorFlow models are absorbed into the RISC-V complex and accelerated to run at hardware speed, taking advantage of the rich, open-source TensorFlow Lite community. The entire development flow has been streamlined using the popular Ashling tool flow, making setup, application creation, and debugging a simple and intuitive process.
The TinyML platform's library of custom instructions is available to the open-source community on the Efinix GitHub, with free access to the Efinix Sapphire core and everything needed to design and develop highly accelerated edge AI applications.
Acceleration strategies: an overview
The combination of the RISC-V core, the Efinix FPGA fabric, and the rich, open-source TensorFlow community allows for creative acceleration strategies that can be broken down into several steps (Figure 4):
Step 1: Run the TensorFlow Lite model in software using the Efinity RISC-V IDE,
Step 2: Accelerate the model's compute primitives with the TinyML platform's predefined custom instructions,
Step 3: Create user-defined custom instruction accelerators,
Step 4: Deploy the accelerated functions in the predefined hardware accelerator sockets.
As stated earlier, "Step 1" is a standard process through the Efinity GUI where users can take a TensorFlow Lite model and run it in software on the RISC-V using the same familiar process they would with a standard MCU — without having to worry about VHDL. After Step 1, designers will often find that the performance of the algorithm they are running is not yet optimal and requires acceleration. "Step 2" involves hardware-software partitioning: users can take the fundamental building blocks inside the TensorFlow Lite model and literally click and drag to instantiate custom instructions, gaining a massive acceleration in the way the model runs on the Sapphire RISC-V core.
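In code, that partitioning boils down to swapping a hot primitive for the custom instruction behind a compile-time switch, leaving the rest of the model untouched. Below is a sketch reusing the hypothetical custom_insn wrapper from above, assumed here to implement a signed multiply:

    #include <stdint.h>

    /* Hardware-software partitioning sketch: only the hot primitive
     * changes when the work moves into the FPGA fabric; the surrounding
     * model code stays identical. custom_insn is the hypothetical
     * wrapper from the previous sketch. */
    #ifdef USE_CUSTOM_INSN
    uint32_t custom_insn(uint32_t a, uint32_t b);
    #  define MUL(a, b) ((int32_t)custom_insn((uint32_t)(a), (uint32_t)(b)))
    #else
    #  define MUL(a, b) ((int32_t)(a) * (int32_t)(b))
    #endif

    /* Dot product of one kernel row: the inner loop of a convolution. */
    int32_t dot3(const int8_t k[3], const uint8_t px[3]) {
        int32_t acc = 0;
        for (int i = 0; i < 3; i++)
            acc += MUL(k[i], px[i]);
        return acc;
    }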
User-defined custom instruction accelerator
"Step 3" leaves it open for designers to create their own custom instructions without leveraging the templates found in the TinyML platform, allowing users to innovate and create acceleration on top of the RISC-V core.
Hardware accelerator templates
Finally, once the required fundamental elements are accelerated on the RISC-V, "Step 4" involves burying them inside the free Efinix SoC framework with "sockets" of acceleration. The quantum accelerator socket allows users to "point at" data, retrieve it, and edit its contents to, say, perform a convolution on bigger blocks of data.
The Sapphire SoC can be used to perform overall system control and execute algorithms that are inherently sequential or require flexibility. As stated earlier, hardware-software codesign allows users to choose whether to perform this compute in the RISC-V processor or in hardware. In this acceleration methodology, the predefined hardware accelerator socket is connected to a direct memory access (DMA) controller and an SoC slave interface for data transfer and CPU control, which may be used for pre-processing/post-processing before or after the AI inference. The DMA controller facilitates communication between the external memory and other building blocks in the design by (Figure 5; a driver-side sketch follows this list):
Storing frames of data into the external memory,
Sending and receiving data to/from the hardware acceleration block,
Sending data to the post-processing engine.
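Here is a driver-side sketch of what commanding such a socket from the RISC-V could look like, with the strong caveat that every base address, register offset, and bit field below is a made-up placeholder rather than the Efinix register map:

    #include <stdint.h>

    /* Hypothetical register map for a DMA-fed accelerator socket. These
     * offsets and the base address are illustrative placeholders only. */
    #define ACCEL_BASE      0x40000000u
    #define REG_SRC_ADDR    0x00u  /* frame in external memory       */
    #define REG_DST_ADDR    0x04u  /* where results are written back */
    #define REG_LEN_BYTES   0x08u  /* transfer length                */
    #define REG_CTRL        0x0Cu  /* bit 0: start                   */
    #define REG_STATUS      0x10u  /* bit 0: done                    */

    static inline void mmio_write(uint32_t off, uint32_t v) {
        *(volatile uint32_t *)(ACCEL_BASE + off) = v;
    }
    static inline uint32_t mmio_read(uint32_t off) {
        return *(volatile uint32_t *)(ACCEL_BASE + off);
    }

    /* Point the socket at a frame, start it, and wait for completion.
     * The CPU stays free for pre-/post-processing around this call. */
    void accel_run(uint32_t src, uint32_t dst, uint32_t len) {
        mmio_write(REG_SRC_ADDR, src);
        mmio_write(REG_DST_ADDR, dst);
        mmio_write(REG_LEN_BYTES, len);
        mmio_write(REG_CTRL, 1u);               /* kick off the DMA   */
        while ((mmio_read(REG_STATUS) & 1u) == 0)
            ;                                   /* poll until done    */
    }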
In an image-signal-processing application, this can look like leaving the RISC-V processor to execute the RGB to grayscale conversion as embedded software, while the hardware accelerator performs Sobel edge detection, binary erosion, and binary dilation in the pipelined, streaming architecture of the FPGA (see "Edge Vision SoC User Guide"). This can be scaled up for multi-camera vision systems, allowing companies to turn their designs into a product and deploy them extremely rapidly.
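The software half of that split is ordinary C on the RISC-V. For instance, a common integer approximation of the BT.601 luma weights (the coefficients are a standard approximation, not taken from the Efinix guide):

    #include <stdint.h>

    /* RGB-to-grayscale on the RISC-V core: integer approximation of the
     * BT.601 luma weights (0.299 R + 0.587 G + 0.114 B), scaled by 256.
     * The Sobel/erosion/dilation stages stay in the FPGA pipeline. */
    static inline uint8_t rgb_to_gray(uint8_t r, uint8_t g, uint8_t b) {
        return (uint8_t)((77u * r + 150u * g + 29u * b) >> 8);
    }

    void gray_frame(const uint8_t *rgb, uint8_t *gray, int n_pixels) {
        for (int i = 0; i < n_pixels; i++) {
            const uint8_t *p = &rgb[3 * i];
            gray[i] = rgb_to_gray(p[0], p[1], p[2]);
        }
    }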
MediaPipe Face Mesh use case
The simplicity of this process might be better highlighted with an example. The MediaPipe Face Mesh ML model estimates hundreds of three-dimensional facial landmarks in real time. Efinix took this model and deployed it on the Titanium Ti60 development kit running at 300 MHz. As shown in Figure 6, convolutions on the RISC-V core contributed the most to latency. It is worth noting that the FPGA's resource utilization of close to 60% does not actually reflect the size of the ML model; rather, the entire camera subsystem was instantiated in the FPGA in order to perform acceleration benchmarking in real time.
Simple custom instructions with the TinyML platform (Step 2)
Creating and running a simple custom two-registers-in, one-register-out convolution instruction shows a four- to five-fold improvement in latency. The improvement continues as custom instructions are used to accelerate the ADD, MAXIMUM, and MUL functions. However, latency improvements hit a plateau, since the RISC-V spends less and less of its time in these operations (Figure 7).
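That plateau is Amdahl's law at work: once the accelerated primitives stop dominating runtime, speeding them up further barely moves the total. A quick back-of-the-envelope check follows; the 80% share and the speedup factors are assumptions for illustration, not measured values:

    #include <stdio.h>

    /* Amdahl's law: overall speedup when a fraction f of the runtime is
     * accelerated by a factor s. Illustrates why latency plateaus once
     * the accelerated operations no longer dominate. */
    static double amdahl(double f, double s) {
        return 1.0 / ((1.0 - f) + f / s);
    }

    int main(void) {
        /* Hypothetical: convolutions take 80% of runtime. */
        printf("conv 10x faster : %.2fx overall\n", amdahl(0.80, 10.0));
        printf("conv 100x faster: %.2fx overall\n", amdahl(0.80, 100.0));
        /* 3.57x vs 4.81x: a further 10x on conv barely helps, because
         * the remaining 20% of the runtime now dominates. */
        return 0;
    }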
Complex instructions with DMA (Step 4)
An arbitrarily complex ALU is also generated to replace the original CONV. This changes the slope of the original curve and dramatically improves the latency once more. However, FPGA utilization also jumps, since the complex instruction takes more resources inside the FPGA. Once again, the resource bar standing at nearly 100% is simply due to the fact that the FPGA here contains the entire camera subsystem for demonstration purposes; what is important to note is the relative decrease in latency and increase in utilization (Figure 8).
What's more, switching to a larger FPGA, such as the Ti180, would run all of these complex instructions for massive acceleration without using even 50 percent of the FPGA resources available. These apparent tradeoffs are precisely what allow engineers to readily visualize the balancing act between latency, power consumption, and the cost/size of the FPGA. An edge application with stringent latency requirements but more lenient power constraints could opt to increasingly accelerate the design for a drastic performance improvement. In power-constrained applications, this performance headroom can instead be traded for a lower clock speed, yielding a more moderate performance improvement at dramatically lower power.
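As a first-order illustration of that trade, assuming dynamic power scales roughly linearly with clock frequency: a hypothetical design accelerated 4x at full clock can instead run at half clock, still finishing twice as fast as the baseline at roughly half the dynamic power.

    #include <stdio.h>

    /* First-order trade study: dynamic power taken as proportional to
     * clock frequency (a rough approximation). The 4x acceleration
     * figure is an assumption for illustration, not a measurement. */
    int main(void) {
        double accel_speedup = 4.0;  /* hypothetical speedup at full clock */
        double clock_scale   = 0.5;  /* run at half frequency              */
        printf("net speedup : %.1fx\n", accel_speedup * clock_scale);
        printf("dyn. power  : ~%.0f%% of baseline\n", clock_scale * 100.0);
        return 0;
    }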
A paradigm shift in AI/ML development
In a nutshell, Efinix has combined the familiar development environment of the RISC-V ISA with the architecturally flexible FPGA fabric, exploiting the ISA's custom-instruction capability. Unlike many hardware accelerators, this approach does not require any third-party tools or compilers. The acceleration is also fine-grained, operating at the level of individual machine instructions — a level of granularity that only makes sense with an FPGA.
The fact that edge devices can be prototyped and deployed on the innovative design architecture of the Efinix FPGA means the solution is future-proofed. New models and updated network architectures can be expressed in familiar software environments and accelerated at the custom instruction level with only a small amount of VHDL (with libraries of available templates to use for guidance). This degree of hardware-software partitioning where 90 percent of the model remains in software running on the RISC-V allows for an extremely fast time to market. The combination of all of these approaches yields an elegant solution that truly lowers the barriers to entry for implementing an edge device. Designers now have access to a world-class embedded processing capability that can be accessed with a state-of-the-art tool flow and instantiated on the revolutionary Efinix Quantum fabric.