Note: the diffs of some files below are truncated because they were too big.
Changes of Revision 10
kvazaar.changes
Changed
@@ -1,4 +1,63 @@
 -------------------------------------------------------------------
+Thu Feb 16 19:50:40 UTC 2017 - aloisio@gmx.com
+
+- Update to version 1.1.0
+  * Both BDRate and speed improved slightly.
+  Features:
+  * Bitrate control now works at LCU level, giving more
+    consistent results. (2318bd7)
+  * Added --roi parameter for LCU level delta-QP control.
+    (4a0121a)
+  * Added --slices parameter for encapsulating tiles and
+    WPP-rows into slice NAL's instead of using bitstream
+    offsets. (1e6463c)
+  * Temporal motion vector prediction now works with B-frames.
+    (d892be5)
+  Optimization:
+  * Added AVX2 version of SSD. (778e46d)
+  * Optimized intra reference building. (c31207e)
+  * Optimized bitstream writes. (a9e45ef)
+  * Optimized CU-split decision. (2c069a3)
+  * Fix main-thread busy-looping on Linux. (a5a925f)
+  * Avoid initializing memory needlessly during RDOQ.
+    (acd12cb, b021d22)
+  Fixes:
+  * Pass DTS and PTS timestamps correctly through the API.
+    (d18de19)
+  * Fixed bug with subpixel motion estimation within tiles.
+    (2c005cd)
+  * Improved 10-bit RD-performance. (70a52f0)
+  * Fixed for stupendously large bitstreams when
+    --mv-constraint was used with --subme. (937a764)
+  * Fixed bug with --smp and --amp. (46c9a48)
+  * Fix problem with --bipred. (1e6463c)
+  * Fixed hang with threading on OSX. (d893474)
+  * Fix crash when frame is less than 65 pixels high and WPP
+    is used. (b8e3513)
+  User Interface:
+  * Disabled WPP with tiles enabled. (cb6672b)
+  * Improved --help. (5bf7454, 78a28e0)
+  * Made it possible to disable the gop-structure that was
+    enabled by default in v.1.0.0. (deb63f7)
+  * Have --threads=auto enable threading instead of disabling
+    it. (db5e750)
+  * Give errors on failures and handle them better. (97863cd,
+    6a178de)
+  * Use reference picture number of medium preset by default.
+    (7ff33e1)
+  Building:
+  * Include optimizations on 32-bit. (1dcc993)
+  * Added appveyor CI tests for MSYS2. (e269b86)
+  * Add pkg-config macros, so pkg-config doesn't need to be
+    installed anymore. (2d7daa1)
+  * Travis CI OSX tests work again. (c32f5fa)
+  Refactoring:
+  * Refactored deblocking and sign hiding. (7ec5f78)
+  * Removed Exp-Golomb lookup table. (ed3bd89)
+  * Copy kvz_config to encoder_control_t and remove duplicate
+    fields. (e78a8df)
+
+-------------------------------------------------------------------
 Tue Oct 4 07:43:42 UTC 2016 - aloisio@gmx.com
 
 - Update to version 1.0.0
kvazaar.spec
Changed
@@ -1,7 +1,7 @@
 #
 # spec file for package kvazaar
 #
-# Copyright (c) 2016 Packman Team <packman@links2linux.de>
+# Copyright (c) 2017 Packman Team <packman@links2linux.de>
 #
 # All modifications and additions to the file contributed by third parties
 # remain the property of their copyright owners, unless otherwise agreed
@@ -18,7 +18,7 @@
 %define libname libkvazaar
 %define libmver 3
 Name: kvazaar
-Version: 1.0.0
+Version: 1.1.0
 Release: 0
 Summary: HEVC encoder
 License: LGPL-2.1
kvazaar-1.0.0.tar.gz/.travis.yml -> kvazaar-1.1.0.tar.gz/.travis.yml
Changed
@@ -22,8 +22,6 @@
 matrix:
   fast_finish: true
-  allow_failures:
-    - os: osx # Don't know what's wrong. Something changed in the environment.
   include:
     - compiler: clang
@@ -115,9 +113,25 @@
     - env: TEST_FRAMES=20 VALGRIND_TEST="--gop=8 -p0 --threads=2 --wpp --owf=1 --rd=0 --no-rdoq --no-deblock --no-sao --no-signhide --subme=0 --pu-depth-inter=1-3 --pu-depth-intra=2-3"
     - env: TEST_FRAMES=10 VALGRIND_TEST="--gop=8 -p0 --threads=2 --wpp --owf=4 --rd=0 --no-rdoq --no-deblock --no-sao --no-signhide --subme=0 --pu-depth-inter=1-3 --pu-depth-intra=2-3"
     - env: TEST_FRAMES=20 VALGRIND_TEST="--gop=8 -p0 --threads=2 --wpp --owf=0 --rd=0 --no-rdoq --no-deblock --no-sao --no-signhide --subme=0 --pu-depth-inter=1-3 --pu-depth-intra=2-3"
+
+    # Tests for --mv-constraint
+    - env: VALGRIND_TEST="--threads=2 --owf=1 --preset=ultrafast --pu-depth-inter=0-3 --mv-constraint=frametilemargin"
+    - env: VALGRIND_TEST="--threads=2 --owf=1 --preset=ultrafast --subme=4 --mv-constraint=frametilemargin"
+
+    # Tests for --slices
+    - env: TEST_DIM=512x256 VALGRIND_TEST="--threads=2 --owf=1 --preset=ultrafast --tiles=2x2 --slices=tiles"
+    - env: VALGRIND_TEST="--threads=2 --owf=1 --preset=ultrafast --slices=wpp"
+
+    # Test weird shapes.
+    - env: TEST_DIM=16x16 VALGRIND_TEST="--threads=2 --owf=1 --preset=veryslow"
+    - env: TEST_DIM=256x16 VALGRIND_TEST="--threads=2 --owf=1 --preset=veryslow"
+    - env: TEST_DIM=16x256 VALGRIND_TEST="--threads=2 --owf=1 --preset=veryslow"
 install:
   - source .travis-install.sh
 script:
   - source .travis-script.sh
+
+after_script:
+  - set +e # Disable errors to work around Travis not knowing how to fix their stuff.
kvazaar-1.0.0.tar.gz/Makefile.am -> kvazaar-1.1.0.tar.gz/Makefile.am
Changed
@@ -15,6 +15,6 @@ # Run scripts to maintain autogenerated documentation # in the version control. -docs: +docs: all ./tools/genmanpage.sh ./tools/update_readme.sh ./tools/genmanpage.sh ./tools/update_readme.sh
kvazaar-1.0.0.tar.gz/README.md -> kvazaar-1.1.0.tar.gz/README.md
Changed
@@ -4,180 +4,221 @@ Join channel #kvazaar_hevc in Freenode IRC network to contact us. -Kvazaar is not yet finished and does not implement all the features of -HEVC. Compression performance will increase as we add more coding tools. +Kvazaar is still under development. Speed and RD-quality will continue to improve. http://ultravideo.cs.tut.fi/#encoder for more information. -[![Build Status](https://travis-ci.org/ultravideo/kvazaar.svg?branch=master)](https://travis-ci.org/ultravideo/kvazaar) +- Linux/Mac [![Build Status](https://travis-ci.org/ultravideo/kvazaar.svg?branch=master)](https://travis-ci.org/ultravideo/kvazaar) +- Windows [![Build status](https://ci.appveyor.com/api/projects/status/88sg1h25lp0k71pu?svg=true)](https://ci.appveyor.com/project/Ultravideo/kvazaar) ## Using Kvazaar +### Example: + + kvazaar --input BQMall_832x480_60.yuv --output out.hevc + +The mandatory parameters are input and output. If the resolution of the input file is not in the filename, or when pipe is used, the input resolution must also be given: ```--input-res=1920x1080```. + +The default input format is 8-bit yuv420p for 8-bit and yuv420p10le for 10-bit. Input format and bitdepth can be selected with ```--input-format``` and ```--input-bitdepth```. + +Speed and compression quality can be selected with ```--preset```, or by setting the options manually. 
+ +### Parameters + [comment]: # (BEGIN KVAZAAR HELP MESSAGE) ``` Usage: kvazaar -i <input> --input-res <width>x<height> -o <output> -Optional parameters: - --help : Print this help message and exit - --version : Print version information and exit - -n, --frames <integer> : Number of frames to code [all] - --seek <integer> : First frame to code [0] - --input-res <int>x<int> : Input resolution (width x height) or - auto : try to detect from file name [auto] - --input-fps <num>/<denom> : Framerate of the input video [25.0] - -q, --qp <integer> : Quantization Parameter [32] - -p, --period <integer> : Period of intra pictures [0] - 0: only first picture is intra - 1: all pictures are intra - 2-N: every Nth picture is intra - --vps-period <integer> : Specify how often the video parameter set is - re-sent. [0] - 0: only send VPS with the first frame - 1: send VPS with every intra frame - N: send VPS with every Nth intra frame - -r, --ref <integer> : Reference frames, range 1..15 [3] - --no-deblock : Disable deblocking filter - --deblock <beta:tc> : Deblocking filter parameters - beta and tc range is -6..6 [0:0] - --no-sao : Disable sample adaptive offset - --no-rdoq : Disable RDO quantization - --no-signhide : Disable sign hiding in quantization - --smp : Enable Symmetric Motion Partition - --amp : Enable Asymmetric Motion Partition - --rd <integer> : Rate-Distortion Optimization level [1] - 0: no RDO - 1: estimated RDO - 2: full RDO - --mv-rdo : Enable Rate-Distortion Optimized motion vector costs - --full-intra-search : Try all intra modes. - --no-transform-skip : Disable transform skip - --aud : Use access unit delimiters - --cqmfile <string> : Custom Quantization Matrices from a file - --debug <string> : Output encoders reconstruction. - --cpuid <integer> : Disable runtime cpu optimizations with value 0. 
- --me <string> : Set integer motion estimation algorithm ["hexbs"] - "hexbs": Hexagon Based Search (faster) - "tz": Test Zone Search (better quality) - "full": Full Search (super slow) - --subme <integer> : Set fractional pixel motion estimation level [4]. - 0: only integer motion estimation - 1: + 1/2-pixel horizontal and vertical - 2: + 1/2-pixel diagonal - 3: + 1/4-pixel horizontal and vertical - 4: + 1/4-pixel diagonal - --source-scan-type <string> : Set source scan type ["progressive"]. - "progressive": progressive scan - "tff": top field first - "bff": bottom field first - --pu-depth-inter <int>-<int> : Range for sizes of inter prediction units to try. - 0: 64x64, 1: 32x32, 2: 16x16, 3: 8x8 - --pu-depth-intra <int>-<int> : Range for sizes of intra prediction units to try. - 0: 64x64, 1: 32x32, 2: 16x16, 3: 8x8, 4: 4x4 - --no-info : Don't add information about the encoder to settings. - --gop <string> : Definition of GOP structure [0] - "0": disabled - "8": B-frame pyramid of length 8 - "lp-<string>": lp-gop definition (e.g. lp-g8d4r3t2) - --bipred : Enable bi-prediction search - --bitrate <integer> : Target bitrate. [0] - 0: disable rate-control - N: target N bits per second - --preset <string> : Use preset. This will override previous options. - ultrafast, superfast, veryfast, faster, - fast, medium, slow, slower, veryslow, placebo - --no-psnr : Don't calculate PSNR for frames - --loop-input : Re-read input file forever - --mv-constraint : Constrain movement vectors - "none": no constraint - "frametile": constrain within the tile - "frametilemargin": constrain even more - --hash : Specify which decoded picture hash to use [checksum] - "none": 0 bytes - "checksum": 18 bytes - "md5": 56 bytes - --cu-split-termination : Specify the cu split termination behaviour - "zero": Terminate when splitting gives little - improvement. 
- "off": Don't terminate splitting early - --me-early-termination : Specify the me early termination behaviour - "off": Early termination is off - "on": Early termination is on - "sensitive": Sensitive early termination is on - --lossless : Use lossless coding - --implicit-rdpcm : Enable implicit residual DPCM. Currently only supported - with lossless coding. - --no-tmvp : Disable Temporal Motion Vector Prediction - --rdoq-skip : Skips RDOQ for 4x4 blocks - --input-format : P420 or P400 - --input-bitdepth : 8-16 - - Video Usability Information: - --sar <width:height> : Specify Sample Aspect Ratio - --overscan <string> : Specify crop overscan setting ["undef"] - - undef, show, crop - --videoformat <string> : Specify video format ["undef"] - - component, pal, ntsc, secam, mac, undef - --range <string> : Specify color range ["tv"] - - tv, pc - --colorprim <string> : Specify color primaries ["undef"] - - undef, bt709, bt470m, bt470bg, - smpte170m, smpte240m, film, bt2020 - --transfer <string> : Specify transfer characteristics ["undef"] - - undef, bt709, bt470m, bt470bg, - smpte170m, smpte240m, linear, log100, - log316, iec61966-2-4, bt1361e, - iec61966-2-1, bt2020-10, bt2020-12 - --colormatrix <string> : Specify color matrix setting ["undef"] - - undef, bt709, fcc, bt470bg, smpte170m, - smpte240m, GBR, YCgCo, bt2020nc, bt2020c - --chromaloc <integer> : Specify chroma sample location (0 to 5) [0] - - Parallel processing: - --threads <integer> : Maximum number of threads to use. - Disable threads if set to 0. - - Tiles: - --tiles <int>x<int> : Split picture into width x height uniform tiles. - --tiles-width-split <string>|u<int> : - Specifies a comma separated list of pixel - positions of tiles columns separation coordinates. - Can also be u followed by and a single int n, - in which case it produces columns of uniform width. - --tiles-height-split <string>|u<int> : - Specifies a comma separated list of pixel - positions of tiles rows separation coordinates. 
- Can also be u followed by and a single int n, - in which case it produces rows of uniform height. - - Wpp: - --wpp : Enable wavefront parallel processing - --owf <integer>|auto : Number of parallel frames to process. 0 to disable. - - Deprecated parameters: (might be removed at some point) - Use --input-res: - -w, --width : Width of input in pixels - -h, --height : Height of input in pixels +Required: + -i, --input : Input file + --input-res <res> : Input resolution [auto] + auto: detect from file name + <int>x<int>: width times height + -o, --output : Output file + +Presets: + --preset=<preset> : Set options to a preset [medium] + - ultrafast, superfast, veryfast, faster, + fast, medium, slow, slower, veryslow + placebo + +Input: + -n, --frames <integer> : Number of frames to code [all] + --seek <integer> : First frame to code [0] + --input-fps <num>/<denom> : Framerate of the input video [25.0] + --source-scan-type <string> : Set source scan type [progressive]. + - progressive: progressive scan + - tff: top field first + - bff: bottom field first + --input-format : P420 or P400 + --input-bitdepth : 8-16 + --loop-input : Re-read input file forever + +Options: + --help : Print this help message and exit
kvazaar-1.1.0.tar.gz/appveyor.yml
Added
@@ -0,0 +1,28 @@
+branches:
+  only:
+    - master
+    - appveyor
+
+environment:
+  matrix:
+    - MSYSTEM: MINGW64
+    - MSYSTEM: MINGW32
+
+shallow_clone: true
+test: off
+
+install:
+  # Update core packages
+  - C:\msys64\usr\bin\pacman -Syyuu --noconfirm --noprogressbar
+  # Update non-core packages
+  - C:\msys64\usr\bin\pacman -Suu --noconfirm --noprogressbar
+  # Install required MSYS2 packages
+  - C:\msys64\usr\bin\pacman -S --noconfirm --noprogressbar --needed automake-wrapper make
+  # Now MSYS2 is up to date, do the rest of the install from a bash script
+  - C:\msys64\usr\bin\bash -lc "cd \"$APPVEYOR_BUILD_FOLDER\" && exec ./tools/appveyor-install.sh"
+
+build_script:
+  - C:\msys64\usr\bin\bash -lc "cd \"$APPVEYOR_BUILD_FOLDER\" && exec ./tools/appveyor-build.sh"
+
+cache:
+  - C:\msys64\var\cache\pacman\pkg
kvazaar-1.0.0.tar.gz/configure.ac -> kvazaar-1.1.0.tar.gz/configure.ac
Changed
@@ -23,7 +23,7 @@
 #
 # Here is a somewhat sane guide to lib versioning: http://apr.apache.org/versioning.html
 ver_major=3
-ver_minor=13
+ver_minor=15
 ver_release=0
 # Prevents configure from adding a lot of defines to the CFLAGS
@@ -59,12 +59,15 @@
 AC_ARG_WITH([cryptopp],
   AS_HELP_STRING([--with-cryptopp], [Build with cryptopp Enables selective encryption.]))
-AS_IF([test "x$with_cryptopp" = "xyes"], [
-  PKG_CHECK_MODULES([cryptopp], [cryptopp],
+AS_IF([test "x$with_cryptopp" = "xyes"],
+  [PKG_CHECK_MODULES([cryptopp], [cryptopp],
     [AC_DEFINE([KVZ_SEL_ENCRYPTION], [1], [With cryptopp])],
-    [AC_MSG_ERROR([cryptopp not found with pkg-config])]
-  )
-])
+    [PKG_CHECK_MODULES([cryptopp], [libcrypto++],
+      [AC_DEFINE([KVZ_SEL_ENCRYPTION], [1], [With cryptopp])],
+      [AC_MSG_ERROR([neither cryptopp nor libcrypto++ found with pkg-config])]
+    )]
+  )]
+)
 AM_CONDITIONAL([USE_CRYPTOPP], [test "x$with_cryptopp" = "xyes"])
 CPPFLAGS="$CPPFLAGS $cryptopp_CFLAGS"
kvazaar-1.0.0.tar.gz/doc/kvazaar.1 -> kvazaar-1.1.0.tar.gz/doc/kvazaar.1
Changed
@@ -1,241 +1,242 @@ -.TH KVAZAAR "1" "October 2016" "kvazaar v0.8.3" "User Commands" +.TH KVAZAAR "1" "February 2017" "kvazaar v1.1.0" "User Commands" .SH NAME kvazaar \- open source HEVC encoder .SH SYNOPSIS \fBkvazaar \fR\-i <input> \-\-input\-res <width>x<height> \-o <output> .SH DESCRIPTION .TP -\fB\-\-help -Print this help message and exit +\fB\-i\fR, \fB\-\-input +Input file .TP -\fB\-\-version -Print version information and exit +\fB\-\-input\-res <res> +Input resolution [auto] +auto: detect from file name +<int>x<int>: width times height +.TP +\fB\-o\fR, \fB\-\-output +Output file + +.SS "Presets:" +.TP +\fB\-\-preset=<preset> +Set options to a preset [medium] + \- ultrafast, superfast, veryfast, faster, + fast, medium, slow, slower, veryslow + placebo + +.SS "Input:" .TP \fB\-n\fR, \fB\-\-frames <integer> Number of frames to code [all] .TP -\fB\-\-seek <integer> +\fB\-\-seek <integer> First frame to code [0] .TP -\fB\-\-input\-res <int>x<int> -Input resolution (width x height) or -auto -try to detect from file name [auto] -.TP -\fB\-\-input\-fps <num>/<denom> +\fB\-\-input\-fps <num>/<denom> Framerate of the input video [25.0] .TP -\fB\-q\fR, \fB\-\-qp <integer> -Quantization Parameter [32] -.TP -\fB\-p\fR, \fB\-\-period <integer> -Period of intra pictures [0] - 0: only first picture is intra - 1: all pictures are intra - 2\-N: every Nth picture is intra -.TP -\fB\-\-vps\-period <integer> -Specify how often the video parameter set is -re\-sent. 
[0] - 0: only send VPS with the first frame - 1: send VPS with every intra frame - N: send VPS with every Nth intra frame -.TP -\fB\-r\fR, \fB\-\-ref <integer> -Reference frames, range 1..15 [3] -.TP -\fB\-\-no\-deblock -Disable deblocking filter -.TP -\fB\-\-deblock <beta:tc> -Deblocking filter parameters -beta and tc range is \-6..6 [0:0] -.TP -\fB\-\-no\-sao -Disable sample adaptive offset -.TP -\fB\-\-no\-rdoq -Disable RDO quantization -.TP -\fB\-\-no\-signhide -Disable sign hiding in quantization -.TP -\fB\-\-smp -Enable Symmetric Motion Partition +\fB\-\-source\-scan\-type <string> +Set source scan type [progressive]. + \- progressive: progressive scan + \- tff: top field first + \- bff: bottom field first .TP -\fB\-\-amp -Enable Asymmetric Motion Partition +\fB\-\-input\-format +P420 or P400 .TP -\fB\-\-rd <integer> -Rate\-Distortion Optimization level [1] - 0: no RDO - 1: estimated RDO - 2: full RDO +\fB\-\-input\-bitdepth +8\-16 .TP -\fB\-\-mv\-rdo -Enable Rate\-Distortion Optimized motion vector costs +\fB\-\-loop\-input +Re\-read input file forever + +.SS "Options:" .TP -\fB\-\-full\-intra\-search -Try all intra modes. +\fB\-\-help +Print this help message and exit .TP -\fB\-\-no\-transform\-skip -Disable transform skip +\fB\-\-version +Print version information and exit .TP \fB\-\-aud Use access unit delimiters .TP -\fB\-\-cqmfile <string> -Custom Quantization Matrices from a file -.TP \fB\-\-debug <string> Output encoders reconstruction. .TP \fB\-\-cpuid <integer> Disable runtime cpu optimizations with value 0. .TP -\fB\-\-me <string> -Set integer motion estimation algorithm ["hexbs"] - "hexbs": Hexagon Based Search (faster) - "tz": Test Zone Search (better quality) - "full": Full Search (super slow) +\fB\-\-hash +Decoded picture hash [checksum] + \- none: 0 bytes + \- checksum: 18 bytes + \- md5: 56 bytes .TP -\fB\-\-subme <integer> -Set fractional pixel motion estimation level [4]. 
- 0: only integer motion estimation - 1: + 1/2\-pixel horizontal and vertical - 2: + 1/2\-pixel diagonal - 3: + 1/4\-pixel horizontal and vertical - 4: + 1/4\-pixel diagonal +\fB\-\-no\-psnr +Don't calculate PSNR for frames .TP -\fB\-\-source\-scan\-type <string> -Set source scan type ["progressive"]. - "progressive": progressive scan - "tff": top field first - "bff": bottom field first +\fB\-\-no\-info +Don't add encoder info SEI. + +.SS "Video structure:" .TP -\fB\-\-pu\-depth\-inter <int>\-<int> -Range for sizes of inter prediction units to try. - 0: 64x64, 1: 32x32, 2: 16x16, 3: 8x8 +\fB\-q\fR, \fB\-\-qp <integer> +Quantization Parameter [32] .TP -\fB\-\-pu\-depth\-intra <int>\-<int> -Range for sizes of intra prediction units to try. - 0: 64x64, 1: 32x32, 2: 16x16, 3: 8x8, 4: 4x4 +\fB\-p\fR, \fB\-\-period <integer> +Period of intra pictures [0] +\- 0: only first picture is intra +\- 1: all pictures are intra +\- 2\-N: every Nth picture is intra .TP -\fB\-\-no\-info -Don't add information about the encoder to settings. +\fB\-\-vps\-period <integer> +Specify how often the video parameter set is +re\-sent. [0] + \- 0: only send VPS with the first frame + \- N: send VPS with every Nth intra frame +.TP +\fB\-r\fR, \fB\-\-ref <integer> +Reference frames, range 1..15 [3] .TP \fB\-\-gop <string> Definition of GOP structure [0] - "0": disabled - "8": B\-frame pyramid of length 8 - "lp\-<string>": lp\-gop definition (e.g. lp\-g8d4r3t2) + \- 0: disabled + \- 8: B\-frame pyramid of length 8 + \- lp\-<string>: lp\-gop definition
kvazaar-1.1.0.tar.gz/m4/pkg.m4
Added
@@ -0,0 +1,275 @@ +dnl pkg.m4 - Macros to locate and utilise pkg-config. -*- Autoconf -*- +dnl serial 11 (pkg-config-0.29.1) +dnl +dnl Copyright © 2004 Scott James Remnant <scott@netsplit.com>. +dnl Copyright © 2012-2015 Dan Nicholson <dbn.lists@gmail.com> +dnl +dnl This program is free software; you can redistribute it and/or modify +dnl it under the terms of the GNU General Public License as published by +dnl the Free Software Foundation; either version 2 of the License, or +dnl (at your option) any later version. +dnl +dnl This program is distributed in the hope that it will be useful, but +dnl WITHOUT ANY WARRANTY; without even the implied warranty of +dnl MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +dnl General Public License for more details. +dnl +dnl You should have received a copy of the GNU General Public License +dnl along with this program; if not, write to the Free Software +dnl Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA +dnl 02111-1307, USA. +dnl +dnl As a special exception to the GNU General Public License, if you +dnl distribute this file as part of a program that contains a +dnl configuration script generated by Autoconf, you may include it under +dnl the same distribution terms that you use for the rest of that +dnl program. + +dnl PKG_PREREQ(MIN-VERSION) +dnl ----------------------- +dnl Since: 0.29 +dnl +dnl Verify that the version of the pkg-config macros are at least +dnl MIN-VERSION. Unlike PKG_PROG_PKG_CONFIG, which checks the user's +dnl installed version of pkg-config, this checks the developer's version +dnl of pkg.m4 when generating configure. +dnl +dnl To ensure that this macro is defined, also add: +dnl m4_ifndef([PKG_PREREQ], +dnl [m4_fatal([must install pkg-config 0.29 or later before running autoconf/autogen])]) +dnl +dnl See the "Since" comment for each macro you use to see what version +dnl of the macros you require. 
+m4_defun([PKG_PREREQ], +[m4_define([PKG_MACROS_VERSION], [0.29.1]) +m4_if(m4_version_compare(PKG_MACROS_VERSION, [$1]), -1, + [m4_fatal([pkg.m4 version $1 or higher is required but ]PKG_MACROS_VERSION[ found])]) +])dnl PKG_PREREQ + +dnl PKG_PROG_PKG_CONFIG([MIN-VERSION]) +dnl ---------------------------------- +dnl Since: 0.16 +dnl +dnl Search for the pkg-config tool and set the PKG_CONFIG variable to +dnl first found in the path. Checks that the version of pkg-config found +dnl is at least MIN-VERSION. If MIN-VERSION is not specified, 0.9.0 is +dnl used since that's the first version where most current features of +dnl pkg-config existed. +AC_DEFUN([PKG_PROG_PKG_CONFIG], +[m4_pattern_forbid([^_?PKG_[A-Z_]+$]) +m4_pattern_allow([^PKG_CONFIG(_(PATH|LIBDIR|SYSROOT_DIR|ALLOW_SYSTEM_(CFLAGS|LIBS)))?$]) +m4_pattern_allow([^PKG_CONFIG_(DISABLE_UNINSTALLED|TOP_BUILD_DIR|DEBUG_SPEW)$]) +AC_ARG_VAR([PKG_CONFIG], [path to pkg-config utility]) +AC_ARG_VAR([PKG_CONFIG_PATH], [directories to add to pkg-config's search path]) +AC_ARG_VAR([PKG_CONFIG_LIBDIR], [path overriding pkg-config's built-in search path]) + +if test "x$ac_cv_env_PKG_CONFIG_set" != "xset"; then + AC_PATH_TOOL([PKG_CONFIG], [pkg-config]) +fi +if test -n "$PKG_CONFIG"; then + _pkg_min_version=m4_default([$1], [0.9.0]) + AC_MSG_CHECKING([pkg-config is at least version $_pkg_min_version]) + if $PKG_CONFIG --atleast-pkgconfig-version $_pkg_min_version; then + AC_MSG_RESULT([yes]) + else + AC_MSG_RESULT([no]) + PKG_CONFIG="" + fi +fi[]dnl +])dnl PKG_PROG_PKG_CONFIG + +dnl PKG_CHECK_EXISTS(MODULES, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND]) +dnl ------------------------------------------------------------------- +dnl Since: 0.18 +dnl +dnl Check to see whether a particular set of modules exists. Similar to +dnl PKG_CHECK_MODULES(), but does not set variables or print errors. 
+dnl +dnl Please remember that m4 expands AC_REQUIRE([PKG_PROG_PKG_CONFIG]) +dnl only at the first occurence in configure.ac, so if the first place +dnl it's called might be skipped (such as if it is within an "if", you +dnl have to call PKG_CHECK_EXISTS manually +AC_DEFUN([PKG_CHECK_EXISTS], +[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl +if test -n "$PKG_CONFIG" && \ + AC_RUN_LOG([$PKG_CONFIG --exists --print-errors "$1"]); then + m4_default([$2], [:]) +m4_ifvaln([$3], [else + $3])dnl +fi]) + +dnl _PKG_CONFIG([VARIABLE], [COMMAND], [MODULES]) +dnl --------------------------------------------- +dnl Internal wrapper calling pkg-config via PKG_CONFIG and setting +dnl pkg_failed based on the result. +m4_define([_PKG_CONFIG], +[if test -n "$$1"; then + pkg_cv_[]$1="$$1" + elif test -n "$PKG_CONFIG"; then + PKG_CHECK_EXISTS([$3], + [pkg_cv_[]$1=`$PKG_CONFIG --[]$2 "$3" 2>/dev/null` + test "x$?" != "x0" && pkg_failed=yes ], + [pkg_failed=yes]) + else + pkg_failed=untried +fi[]dnl +])dnl _PKG_CONFIG + +dnl _PKG_SHORT_ERRORS_SUPPORTED +dnl --------------------------- +dnl Internal check to see if pkg-config supports short errors. 
+AC_DEFUN([_PKG_SHORT_ERRORS_SUPPORTED], +[AC_REQUIRE([PKG_PROG_PKG_CONFIG]) +if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then + _pkg_short_errors_supported=yes +else + _pkg_short_errors_supported=no +fi[]dnl +])dnl _PKG_SHORT_ERRORS_SUPPORTED + + +dnl PKG_CHECK_MODULES(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND], +dnl [ACTION-IF-NOT-FOUND]) +dnl -------------------------------------------------------------- +dnl Since: 0.4.0 +dnl +dnl Note that if there is a possibility the first call to +dnl PKG_CHECK_MODULES might not happen, you should be sure to include an +dnl explicit call to PKG_PROG_PKG_CONFIG in your configure.ac +AC_DEFUN([PKG_CHECK_MODULES], +[AC_REQUIRE([PKG_PROG_PKG_CONFIG])dnl +AC_ARG_VAR([$1][_CFLAGS], [C compiler flags for $1, overriding pkg-config])dnl +AC_ARG_VAR([$1][_LIBS], [linker flags for $1, overriding pkg-config])dnl + +pkg_failed=no +AC_MSG_CHECKING([for $1]) + +_PKG_CONFIG([$1][_CFLAGS], [cflags], [$2]) +_PKG_CONFIG([$1][_LIBS], [libs], [$2]) + +m4_define([_PKG_TEXT], [Alternatively, you may set the environment variables $1[]_CFLAGS +and $1[]_LIBS to avoid the need to call pkg-config. +See the pkg-config man page for more details.]) + +if test $pkg_failed = yes; then + AC_MSG_RESULT([no]) + _PKG_SHORT_ERRORS_SUPPORTED + if test $_pkg_short_errors_supported = yes; then + $1[]_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "$2" 2>&1` + else + $1[]_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "$2" 2>&1` + fi + # Put the nasty error message in config.log where it belongs + echo "$$1[]_PKG_ERRORS" >&AS_MESSAGE_LOG_FD + + m4_default([$4], [AC_MSG_ERROR( +[Package requirements ($2) were not met: + +$$1_PKG_ERRORS + +Consider adjusting the PKG_CONFIG_PATH environment variable if you +installed software in a non-standard prefix. + +_PKG_TEXT])[]dnl + ]) +elif test $pkg_failed = untried; then + AC_MSG_RESULT([no]) + m4_default([$4], [AC_MSG_FAILURE( +[The pkg-config script could not be found or is too old. 
Make sure it +is in your PATH or set the PKG_CONFIG environment variable to the full +path to pkg-config. + +_PKG_TEXT + +To get pkg-config, see <http://pkg-config.freedesktop.org/>.])[]dnl + ]) +else + $1[]_CFLAGS=$pkg_cv_[]$1[]_CFLAGS + $1[]_LIBS=$pkg_cv_[]$1[]_LIBS + AC_MSG_RESULT([yes]) + $3 +fi[]dnl +])dnl PKG_CHECK_MODULES + + +dnl PKG_CHECK_MODULES_STATIC(VARIABLE-PREFIX, MODULES, [ACTION-IF-FOUND], +dnl [ACTION-IF-NOT-FOUND]) +dnl --------------------------------------------------------------------- +dnl Since: 0.29 +dnl
kvazaar-1.0.0.tar.gz/src/bitstream.c -> kvazaar-1.1.0.tar.gz/src/bitstream.c
Changed
@@ -39,8 +39,6 @@
   0x10000000,0x20000000,0x40000000,0x80000000
 };
-bit_table_t kvz_g_exp_table[EXP_GOLOMB_TABLE_SIZE];
-
 //#define VERBOSE
@@ -57,29 +55,6 @@
 #endif
 /**
- * \brief Initialize the Exp Golomb code table.
- *
- * Fills kvz_g_exp_table with exponential golomb codes.
- */
-void kvz_init_exp_golomb()
-{
-  static int exp_table_initialized = 0;
-  if (exp_table_initialized) return;
-
-  uint32_t code_num;
-  uint8_t M;
-  uint32_t info;
-  for (code_num = 0; code_num < EXP_GOLOMB_TABLE_SIZE; code_num++) {
-    M = kvz_math_floor_log2(code_num + 1);
-    info = code_num + 1 - (uint32_t)pow(2, M);
-    kvz_g_exp_table[code_num].len = M * 2 + 1;
-    kvz_g_exp_table[code_num].value = (1<<M) | info;
-  }
-
-  exp_table_initialized = 1;
-}
-
-/**
  * \brief Initialize a new bitstream.
  */
 void kvz_bitstream_init(bitstream_t *const stream)
@@ -217,15 +192,33 @@
 }
 /**
+ * \brief Write a byte to a byte aligned bitstream
+ * \param stream stream the data is to be appended to
+ * \param data input data
+ */
+void kvz_bitstream_put_byte(bitstream_t *const stream, uint32_t data)
+{
+  assert(stream->cur_bit == 0);
+  const uint8_t emulation_prevention_three_byte = 0x03;
+
+  if ((stream->zerocount == 2) && (data < 4)) {
+    kvz_bitstream_writebyte(stream, emulation_prevention_three_byte);
+    stream->zerocount = 0;
+  }
+  stream->zerocount = data == 0 ? stream->zerocount + 1 : 0;
+  kvz_bitstream_writebyte(stream, data);
+}
+
+/**
  * \brief Write bits to bitstream
- * \param stream pointer bitstream to put the data
- * \param data input data
- * \param bits number of bits to write from data to stream
+ * Buffers individual bits untill they make a full byte.
+ * \param stream stream the data is to be appended to
+ * \param data input data
+ * \param bits number of bits to write from data to stream
  */
 void kvz_bitstream_put(bitstream_t *const stream, const uint32_t data, uint8_t bits)
 {
-  const uint8_t emulation_prevention_three_byte = 0x03;
-  while(bits--) {
+  while (bits--) {
     stream->data <<= 1;
     if (data & kvz_bit_set_mask[bits]) {
@@ -234,23 +227,38 @@
     stream->cur_bit++;
     // write byte to output
-    if (stream->cur_bit==8) {
+    if (stream->cur_bit == 8) {
       stream->cur_bit = 0;
-      if((stream->zerocount == 2) && (stream->data < 4)) {
-        kvz_bitstream_writebyte(stream, emulation_prevention_three_byte);
-        stream->zerocount = 0;
-      }
-      if(stream->data == 0) {
-        stream->zerocount++;
-      } else {
-        stream->zerocount = 0;
-      }
-      kvz_bitstream_writebyte(stream, stream->data);
+      kvz_bitstream_put_byte(stream, stream->data);
     }
   }
 }
 /**
+ * \brief Write unsigned Exp-Golomb bit string
+ */
+void kvz_bitstream_put_ue(bitstream_t *stream, uint32_t code_num)
+{
+  unsigned code_num_log2 = kvz_math_floor_log2(code_num + 1);
+  unsigned prefix = 1 << code_num_log2;
+  unsigned suffix = code_num + 1 - prefix;
+  unsigned num_bits = code_num_log2 * 2 + 1;
+  unsigned value = prefix | suffix;
+
+  kvz_bitstream_put(stream, value, num_bits);
+}
+
+/**
+ * \brief Write signed Exp-Golomb bit string
+ */
+void kvz_bitstream_put_se(bitstream_t *stream, int32_t data)
+{
+  // Map positive values to even and negative to odd values.
+  uint32_t code_num = data <= 0 ? (-data) << 1 : (data << 1) - 1;
+  kvz_bitstream_put_ue(stream, code_num);
+}
+
+/**
  * \brief Add rbsp_trailing_bits syntax element, which aligns the bitstream.
  */
 void kvz_bitstream_add_rbsp_trailing_bits(bitstream_t * const stream)
kvazaar-1.0.0.tar.gz/src/bitstream.h -> kvazaar-1.1.0.tar.gz/src/bitstream.h
Changed
@@ -60,10 +60,6 @@
   uint32_t value;
 } bit_table_t;

-extern bit_table_t kvz_g_exp_table[EXP_GOLOMB_TABLE_SIZE];
-
-void kvz_init_exp_golomb();
-
 void kvz_bitstream_init(bitstream_t * stream);
 kvz_data_chunk * kvz_bitstream_alloc_chunk();
 kvz_data_chunk * kvz_bitstream_take_chunks(bitstream_t *stream);
@@ -77,10 +73,10 @@
 void kvz_bitstream_clear(bitstream_t *stream);
 void kvz_bitstream_put(bitstream_t *stream, uint32_t data, uint8_t bits);

-/* Use macros to force inlining */
-#define bitstream_put_ue(stream, data) { kvz_bitstream_put(stream,kvz_g_exp_table[data].value,kvz_g_exp_table[data].len); }
-#define bitstream_put_se(stream, data) { uint32_t index=(uint32_t)(((data)<=0)?(-(data))<<1:((data)<<1)-1); \
-    kvz_bitstream_put(stream,kvz_g_exp_table[index].value,kvz_g_exp_table[index].len); }
+void kvz_bitstream_put_byte(bitstream_t *const stream, const uint32_t data);
+
+void kvz_bitstream_put_ue(bitstream_t *stream, uint32_t data);
+void kvz_bitstream_put_se(bitstream_t *stream, int32_t data);

 void kvz_bitstream_add_rbsp_trailing_bits(bitstream_t *stream);
 void kvz_bitstream_align(bitstream_t *stream);
@@ -90,12 +86,12 @@
 #ifdef KVZ_DEBUG_PRINT_CABAC
 /* Counter to keep up with bits written */
 #define WRITE_U(stream, data, bits, name) { printf("%-40s u(%d) : %d\n", name,bits,data); kvz_bitstream_put(stream,data,bits);}
-#define WRITE_UE(stream, data, name) { printf("%-40s ue(v): %d\n", name,data); bitstream_put_ue(stream,data);}
-#define WRITE_SE(stream, data, name) { printf("%-40s se(v): %d\n", name,data); bitstream_put_se(stream,(data));}
+#define WRITE_UE(stream, data, name) { printf("%-40s ue(v): %d\n", name,data); kvz_bitstream_put_ue(stream,data);}
+#define WRITE_SE(stream, data, name) { printf("%-40s se(v): %d\n", name,data); kvz_bitstream_put_se(stream,(data));}
 #else
 #define WRITE_U(stream, data, bits, name) { kvz_bitstream_put(stream,data,bits); }
-#define WRITE_UE(stream, data, name) { bitstream_put_ue(stream,data); }
-#define WRITE_SE(stream, data, name) { bitstream_put_se(stream,data); }
+#define WRITE_UE(stream, data, name) { kvz_bitstream_put_ue(stream,data); }
+#define WRITE_SE(stream, data, name) { kvz_bitstream_put_se(stream,data); }
 #endif
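The bitstream.h hunks above drop the precomputed kvz_g_exp_table lookup in favour of plain functions. The following is a minimal sketch of how an Exp-Golomb code word can be computed directly instead of looked up. It is not kvazaar's implementation: eg_code_t, exp_golomb_ue and se_to_ue_index are hypothetical names, and only the signed-to-unsigned mapping is taken from the removed bitstream_put_se macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a code word and its length in bits. */
typedef struct {
  uint32_t value;
  uint8_t len;
} eg_code_t;

/*
 * Unsigned Exp-Golomb, ue(v): (data + 1) written in
 * 2 * floor(log2(data + 1)) + 1 bits. The zero prefix appears
 * automatically when `value` is written using `len` bits.
 * Assumption of this sketch: data < 0xFFFF, so len stays below 32.
 */
static eg_code_t exp_golomb_ue(uint32_t data)
{
  assert(data < 0xFFFF);

  const uint32_t code_num = data + 1;
  uint8_t prefix_zeros = 0;
  while ((code_num >> (prefix_zeros + 1)) != 0) {
    prefix_zeros++;
  }

  eg_code_t code = { code_num, (uint8_t)(2 * prefix_zeros + 1) };
  return code;
}

/*
 * Signed-to-unsigned mapping for se(v), as in the removed macro:
 * 0 -> 0, 1 -> 1, -1 -> 2, 2 -> 3, -2 -> 4, ...
 * INT32_MIN is not handled by this sketch.
 */
static uint32_t se_to_ue_index(int32_t data)
{
  return (data <= 0) ? ((uint32_t)(-data) << 1)
                     : (((uint32_t)data << 1) - 1);
}
```

A signed value would then be coded as exp_golomb_ue(se_to_ue_index(v)), with the resulting value and len passed to a bit writer such as kvz_bitstream_put.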
kvazaar-1.0.0.tar.gz/src/cabac.c -> kvazaar-1.1.0.tar.gz/src/cabac.c
Changed
@@ -141,11 +141,11 @@
       uint32_t carry = lead_byte >> 8;
       uint32_t byte = data->buffered_byte + carry;
       data->buffered_byte = lead_byte & 0xff;
-      kvz_bitstream_put(data->stream, byte, 8);
+      kvz_bitstream_put_byte(data->stream, byte);

       byte = (0xff + carry) & 0xff;
       while (data->num_buffered_bytes > 1) {
-        kvz_bitstream_put(data->stream, byte, 8);
+        kvz_bitstream_put_byte(data->stream, byte);
         data->num_buffered_bytes--;
       }
     } else {
@@ -163,18 +163,18 @@
   assert(data->bits_left <= 32);

   if (data->low >> (32 - data->bits_left)) {
-    kvz_bitstream_put(data->stream,data->buffered_byte + 1, 8);
+    kvz_bitstream_put_byte(data->stream, data->buffered_byte + 1);
     while (data->num_buffered_bytes > 1) {
-      kvz_bitstream_put(data->stream, 0, 8);
+      kvz_bitstream_put_byte(data->stream, 0);
       data->num_buffered_bytes--;
     }
     data->low -= 1 << (32 - data->bits_left);
   } else {
     if (data->num_buffered_bytes > 0) {
-      kvz_bitstream_put(data->stream,data->buffered_byte, 8);
+      kvz_bitstream_put_byte(data->stream, data->buffered_byte);
     }
     while (data->num_buffered_bytes > 1) {
-      kvz_bitstream_put(data->stream, 0xff, 8);
+      kvz_bitstream_put_byte(data->stream, 0xff);
       data->num_buffered_bytes--;
     }
   }
@@ -213,17 +213,6 @@
 /**
  * \brief
  */
-void kvz_cabac_flush(cabac_data_t * const data)
-{
-  kvz_cabac_finish(data);
-  kvz_bitstream_put(data->stream, 1, 1);
-  kvz_bitstream_align_zero(data->stream);
-  kvz_cabac_start(data);
-}
-
-/**
- * \brief
- */
 void kvz_cabac_encode_bin_ep(cabac_data_t * const data, const uint32_t bin_value)
 {
   data->low <<= 1;
@@ -559,13 +548,14 @@
   bins = (bins << count) | symbol;
   num_bins += count;

-  if(!state->cabac.only_count)
-    if (state->encoder_control->cfg->crypto_features & KVZ_CRYPTO_MVs) {
+  if (!state->cabac.only_count) {
+    if (state->encoder_control->cfg.crypto_features & KVZ_CRYPTO_MVs) {
       uint32_t key, mask;
       key = ff_get_key(&state->tile->dbs_g, num_bins>>1);
       mask = ( (1<<(num_bins >>1) ) -1 );
       state->tile->m_prev_pos = ( bins + ( state->tile->m_prev_pos^key ) ) & mask;
       bins = ( (bins >> (num_bins >>1) ) << (num_bins >>1) ) | state->tile->m_prev_pos;
     }
+  }
   kvz_cabac_encode_bins_ep(data, bins, num_bins);
 }
kvazaar-1.0.0.tar.gz/src/cabac.h -> kvazaar-1.1.0.tar.gz/src/cabac.h
Changed
@@ -60,6 +60,7 @@
   cabac_ctx_t trans_subdiv_model[3]; //!< \brief intra mode context models
   cabac_ctx_t qt_cbf_model_luma[4];
   cabac_ctx_t qt_cbf_model_chroma[4];
+  cabac_ctx_t cu_qp_delta_abs[4];
   cabac_ctx_t part_size_model[4];
   cabac_ctx_t cu_sig_coeff_group_model[4];
   cabac_ctx_t cu_sig_model_luma[27];
@@ -102,7 +103,6 @@
 void kvz_cabac_encode_bin_trm(cabac_data_t *data, uint8_t bin_value);
 void kvz_cabac_write(cabac_data_t *data);
 void kvz_cabac_finish(cabac_data_t *data);
-void kvz_cabac_flush(cabac_data_t *data);
 void kvz_cabac_write_coeff_remain(cabac_data_t *cabac, uint32_t symbol, uint32_t r_param);
 void kvz_cabac_write_coeff_remain_encry(struct encoder_state_t * const state, cabac_data_t * const cabac, const uint32_t symbol,
kvazaar-1.0.0.tar.gz/src/cfg.c -> kvazaar-1.1.0.tar.gz/src/cfg.c
Changed
@@ -20,6 +20,7 @@ #include "cfg.h" +#include <limits.h> #include <stdio.h> #include <stdlib.h> #include <string.h> @@ -27,15 +28,7 @@ kvz_config *kvz_config_alloc(void) { - kvz_config *cfg = (kvz_config *)malloc(sizeof(kvz_config)); - if (!cfg) { - fprintf(stderr, "Failed to allocate a config object!\n"); - return cfg; - } - - FILL(*cfg, 0); - - return cfg; + return calloc(1, sizeof(kvz_config)); } int kvz_config_init(kvz_config *cfg) @@ -76,7 +69,7 @@ cfg->vui.chroma_loc = 0; /* left center */ cfg->aud_enable = 0; cfg->cqmfile = NULL; - cfg->ref_frames = DEFAULT_REF_PIC_COUNT; + cfg->ref_frames = 1; cfg->gop_len = 4; cfg->gop_lowdelay = true; cfg->bipred = 0; @@ -122,6 +115,12 @@ cfg->gop_lp_definition.d = 3; cfg->gop_lp_definition.t = 1; + cfg->roi.width = 0; + cfg->roi.height = 0; + cfg->roi.dqps = NULL; + + cfg->slices = KVZ_SLICES_NONE; + return 1; } @@ -132,6 +131,7 @@ FREE_POINTER(cfg->tiles_width_split); FREE_POINTER(cfg->tiles_height_split); FREE_POINTER(cfg->slice_addresses_in_ts); + FREE_POINTER(cfg->roi.dqps); } free(cfg); @@ -651,22 +651,43 @@ cfg->vui.chroma_loc = atoi(value); else if OPT("aud") cfg->aud_enable = atobool(value); - else if OPT("cqmfile") - cfg->cqmfile = strdup(value); + else if OPT("cqmfile") { + char* cqmfile = strdup(value); + if (!cqmfile) { + fprintf(stderr, "Failed to allocate memory for CQM file name.\n"); + return 0; + } + FREE_POINTER(cfg->cqmfile); + cfg->cqmfile = cqmfile; + } else if OPT("tiles-width-split") { int retval = parse_tiles_specification(value, &cfg->tiles_width_count, &cfg->tiles_width_split); + if (cfg->tiles_width_count > 1 && cfg->tmvp_enable) { cfg->tmvp_enable = false; fprintf(stderr, "Disabling TMVP because tiles are used.\n"); } + + if (cfg->wpp) { + cfg->wpp = false; + fprintf(stderr, "Disabling WPP because tiles were enabled.\n"); + } + return retval; } else if OPT("tiles-height-split") { int retval = parse_tiles_specification(value, &cfg->tiles_height_count, &cfg->tiles_height_split); + if 
(cfg->tiles_height_count > 1 && cfg->tmvp_enable) { cfg->tmvp_enable = false; fprintf(stderr, "Disabling TMVP because tiles are used.\n"); } + + if (cfg->wpp) { + cfg->wpp = false; + fprintf(stderr, "Disabling WPP because tiles were enabled.\n"); + } + return retval; } else if OPT("tiles") @@ -699,6 +720,11 @@ fprintf(stderr, "Disabling TMVP because tiles are used.\n"); } + if (cfg->wpp) { + cfg->wpp = false; + fprintf(stderr, "Disabling WPP because tiles were enabled.\n"); + } + return 1; } else if OPT("wpp") @@ -709,11 +735,27 @@ // -1 means automatic selection cfg->owf = -1; } - } else if OPT("slice-addresses") { - fprintf(stderr, "--slice-addresses doesn't do anything, because slices are not implemented.\n"); - return parse_slice_specification(value, &cfg->slice_count, &cfg->slice_addresses_in_ts); - } else if OPT("threads") + } else if OPT("slices") { + if (!strcmp(value, "tiles")) { + cfg->slices = KVZ_SLICES_TILES; + return 1; + } else if (!strcmp(value, "wpp")) { + cfg->slices = KVZ_SLICES_WPP; + return 1; + } else if (!strcmp(value, "tiles+wpp")) { + cfg->slices = KVZ_SLICES_TILES | KVZ_SLICES_WPP; + return 1; + } else { + return parse_slice_specification(value, &cfg->slice_count, &cfg->slice_addresses_in_ts); + } + + } else if OPT("threads") { cfg->threads = atoi(value); + if (cfg->threads == 0 && !strcmp(value, "auto")) { + // -1 means automatic selection + cfg->threads = -1; + } + } else if OPT("cpuid") cfg->cpuid = atoi(value); else if OPT("pu-depth-inter") @@ -788,6 +830,12 @@ cfg->gop[7].poc_offset = 7; cfg->gop[7].qp_offset = 4; cfg->gop[7].layer = 4; cfg->gop[7].qp_factor = 0.68; cfg->gop[7].is_ref = 0; cfg->gop[7].ref_neg_count = 3; cfg->gop[7].ref_neg[0] = 1; cfg->gop[7].ref_neg[1] = 3; cfg->gop[7].ref_neg[2] = 7; cfg->gop[7].ref_pos_count = 1; cfg->gop[7].ref_pos[0] = 1; + } else if (atoi(value) == 0) { + //Disable gop + cfg->gop_len = 0; + cfg->gop_lowdelay = 0; + cfg->gop_lp_definition.d = 0; + cfg->gop_lp_definition.t = 0; } else if 
(atoi(value)) { fprintf(stderr, "Input error: unsupported gop length, must be 0 or 8\n"); return 0; @@ -907,10 +955,6 @@ cfg->lossless = (bool)atobool(value); else if OPT("tmvp") { cfg->tmvp_enable = atobool(value); - if (cfg->gop_len && cfg->tmvp_enable) { - fprintf(stderr, "Cannot enable TMVP because GOP is used.\n"); - cfg->tmvp_enable = false; - } if (cfg->tiles_width_count > 1 || cfg->tiles_height_count > 1) { fprintf(stderr, "Cannot enable TMVP because tiles are used.\n"); cfg->tmvp_enable = false; @@ -947,6 +991,60 @@ } else if OPT("implicit-rdpcm") cfg->implicit_rdpcm = (bool)atobool(value); + else if OPT("roi") { + // The ROI description is as follows: + // First number is width, second number is height, + // then follows width * height number of dqp values. + FILE* f = fopen(value, "rb"); + if (!f) { + fprintf(stderr, "Could not open ROI file.\n"); + return 0; + } + + int width = 0; + int height = 0; + if (!fscanf(f, "%d", &width) || !fscanf(f, "%d", &height)) { + fprintf(stderr, "Failed to read ROI size.\n"); + fclose(f); + return 0; + } + + if (width <= 0 || height <= 0) { + fprintf(stderr, "Invalid ROI size: %dx%d.\n", width, height); + fclose(f); + return 0; + } + + if (width > 10000 || height > 10000) { + fprintf(stderr, "ROI dimensions exceed arbitrary value of 10000.\n"); + return 0;
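The comment in the --roi branch above describes the file layout as width, then height, then width * height delta-QP values. A minimal sketch of a reader for that layout follows; it is not the kvazaar code. roi_map_t and roi_read are hypothetical names, the int8_t element type and the row-major order are assumptions, and the 10000 limit is taken from the diff. Unlike the `!fscanf(...)` test in the hunk, the sketch compares the return value with the expected item count, since fscanf returns EOF (a non-zero value) when input ends before any conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container, loosely mirroring cfg->roi in the diff. */
typedef struct {
  int width;
  int height;
  int8_t *dqps; /* width * height entries; row-major order is assumed */
} roi_map_t;

/* Returns 1 on success and 0 on failure. On failure roi->dqps is NULL. */
static int roi_read(FILE *f, roi_map_t *roi)
{
  assert(f != NULL && roi != NULL);

  roi->width = 0;
  roi->height = 0;
  roi->dqps = NULL;

  int width = 0;
  int height = 0;
  /* Both dimensions must be converted; EOF and 0 are both failures. */
  if (fscanf(f, "%d %d", &width, &height) != 2) return 0;
  if (width <= 0 || height <= 0) return 0;
  if (width > 10000 || height > 10000) return 0;

  const size_t count = (size_t)width * (size_t)height;
  int8_t *dqps = malloc(count * sizeof(*dqps));
  if (!dqps) return 0;

  for (size_t i = 0; i < count; i++) {
    int value = 0;
    if (fscanf(f, "%d", &value) != 1 || value < -128 || value > 127) {
      free(dqps);
      return 0;
    }
    dqps[i] = (int8_t)value;
  }

  roi->width = width;
  roi->height = height;
  roi->dqps = dqps;
  return 1;
}
```

The caller owns roi->dqps after a successful call and is expected to free it, in the same way the diff adds FREE_POINTER(cfg->roi.dqps) to the config teardown.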
kvazaar-1.0.0.tar.gz/src/cli.c -> kvazaar-1.1.0.tar.gz/src/cli.c
Changed
@@ -84,7 +84,7 @@ { "wpp", no_argument, NULL, 0 }, { "no-wpp", no_argument, NULL, 0 }, { "owf", required_argument, NULL, 0 }, - { "slice-addresses", required_argument, NULL, 0 }, + { "slices", required_argument, NULL, 0 }, { "threads", required_argument, NULL, 0 }, { "cpuid", required_argument, NULL, 0 }, { "pu-depth-inter", required_argument, NULL, 0 }, @@ -118,6 +118,7 @@ { "input-format", required_argument, NULL, 0 }, { "implicit-rdpcm", no_argument, NULL, 0 }, { "no-implicit-rdpcm", no_argument, NULL, 0 }, + { "roi", required_argument, NULL, 0 }, {0, 0, 0, 0} }; @@ -312,153 +313,185 @@ "Usage:\n" "kvazaar -i <input> --input-res <width>x<height> -o <output>\n" "\n" - "Optional parameters:\n" - " --help : Print this help message and exit\n" - " --version : Print version information and exit\n" - " -n, --frames <integer> : Number of frames to code [all]\n" - " --seek <integer> : First frame to code [0]\n" - " --input-res <int>x<int> : Input resolution (width x height) or\n" - " auto : try to detect from file name [auto]\n" - " --input-fps <num>/<denom> : Framerate of the input video [25.0]\n" - " -q, --qp <integer> : Quantization Parameter [32]\n" - " -p, --period <integer> : Period of intra pictures [0]\n" - " 0: only first picture is intra\n" - " 1: all pictures are intra\n" - " 2-N: every Nth picture is intra\n" - " --vps-period <integer> : Specify how often the video parameter set is\n" - " re-sent. 
[0]\n" - " 0: only send VPS with the first frame\n" - " 1: send VPS with every intra frame\n" - " N: send VPS with every Nth intra frame\n" - " -r, --ref <integer> : Reference frames, range 1..15 [3]\n" - " --no-deblock : Disable deblocking filter\n" - " --deblock <beta:tc> : Deblocking filter parameters\n" - " beta and tc range is -6..6 [0:0]\n" - " --no-sao : Disable sample adaptive offset\n" - " --no-rdoq : Disable RDO quantization\n" - " --no-signhide : Disable sign hiding in quantization\n" - " --smp : Enable Symmetric Motion Partition\n" - " --amp : Enable Asymmetric Motion Partition\n" - " --rd <integer> : Rate-Distortion Optimization level [1]\n" - " 0: no RDO\n" - " 1: estimated RDO\n" - " 2: full RDO\n" - " --mv-rdo : Enable Rate-Distortion Optimized motion vector costs\n" - " --full-intra-search : Try all intra modes.\n" - " --no-transform-skip : Disable transform skip\n" - " --aud : Use access unit delimiters\n" - " --cqmfile <string> : Custom Quantization Matrices from a file\n" - " --debug <string> : Output encoders reconstruction.\n" - " --cpuid <integer> : Disable runtime cpu optimizations with value 0.\n" - " --me <string> : Set integer motion estimation algorithm [\"hexbs\"]\n" - " \"hexbs\": Hexagon Based Search (faster)\n" - " \"tz\": Test Zone Search (better quality)\n" - " \"full\": Full Search (super slow)\n" - " --subme <integer> : Set fractional pixel motion estimation level [4].\n" - " 0: only integer motion estimation\n" - " 1: + 1/2-pixel horizontal and vertical\n" - " 2: + 1/2-pixel diagonal\n" - " 3: + 1/4-pixel horizontal and vertical\n" - " 4: + 1/4-pixel diagonal\n" - " --source-scan-type <string> : Set source scan type [\"progressive\"].\n" - " \"progressive\": progressive scan\n" - " \"tff\": top field first\n" - " \"bff\": bottom field first\n" - " --pu-depth-inter <int>-<int> : Range for sizes of inter prediction units to try.\n" - " 0: 64x64, 1: 32x32, 2: 16x16, 3: 8x8\n" - " --pu-depth-intra <int>-<int> : Range for sizes of 
intra prediction units to try.\n" - " 0: 64x64, 1: 32x32, 2: 16x16, 3: 8x8, 4: 4x4\n" - " --no-info : Don't add information about the encoder to settings.\n" - " --gop <string> : Definition of GOP structure [0]\n" - " \"0\": disabled\n" - " \"8\": B-frame pyramid of length 8\n" - " \"lp-<string>\": lp-gop definition (e.g. lp-g8d4r3t2)\n" - " --bipred : Enable bi-prediction search\n" - " --bitrate <integer> : Target bitrate. [0]\n" - " 0: disable rate-control\n" - " N: target N bits per second\n" - " --preset <string> : Use preset. This will override previous options.\n" - " ultrafast, superfast, veryfast, faster,\n" - " fast, medium, slow, slower, veryslow, placebo\n" - " --no-psnr : Don't calculate PSNR for frames\n" - " --loop-input : Re-read input file forever\n" - " --mv-constraint : Constrain movement vectors\n" - " \"none\": no constraint\n" - " \"frametile\": constrain within the tile\n" - " \"frametilemargin\": constrain even more\n" - " --hash : Specify which decoded picture hash to use [checksum]\n" - " \"none\": 0 bytes\n" - " \"checksum\": 18 bytes\n" - " \"md5\": 56 bytes\n" - " --cu-split-termination : Specify the cu split termination behaviour\n" - " \"zero\": Terminate when splitting gives little\n" - " improvement.\n" - " \"off\": Don't terminate splitting early\n" - " --me-early-termination : Specify the me early termination behaviour\n" - " \"off\": Early termination is off\n" - " \"on\": Early termination is on\n" - " \"sensitive\": Sensitive early termination is on\n" - " --lossless : Use lossless coding\n" - " --implicit-rdpcm : Enable implicit residual DPCM. 
Currently only supported\n" - " with lossless coding.\n" - " --no-tmvp : Disable Temporal Motion Vector Prediction\n" - " --rdoq-skip : Skips RDOQ for 4x4 blocks\n" - " --input-format : P420 or P400\n" - " --input-bitdepth : 8-16\n" + /* Word wrap to this width to stay under 80 characters (including ") ************/ + "Required:\n" + " -i, --input : Input file\n" + " --input-res <res> : Input resolution [auto]\n" + " auto: detect from file name\n" + " <int>x<int>: width times height\n" + " -o, --output : Output file\n" "\n" - " Video Usability Information:\n" - " --sar <width:height> : Specify Sample Aspect Ratio\n" - " --overscan <string> : Specify crop overscan setting [\"undef\"]\n" - " - undef, show, crop\n" - " --videoformat <string> : Specify video format [\"undef\"]\n" - " - component, pal, ntsc, secam, mac, undef\n" - " --range <string> : Specify color range [\"tv\"]\n" - " - tv, pc\n" - " --colorprim <string> : Specify color primaries [\"undef\"]\n" - " - undef, bt709, bt470m, bt470bg,\n" - " smpte170m, smpte240m, film, bt2020\n" - " --transfer <string> : Specify transfer characteristics [\"undef\"]\n" - " - undef, bt709, bt470m, bt470bg,\n" - " smpte170m, smpte240m, linear, log100,\n" - " log316, iec61966-2-4, bt1361e,\n" - " iec61966-2-1, bt2020-10, bt2020-12\n" - " --colormatrix <string> : Specify color matrix setting [\"undef\"]\n" - " - undef, bt709, fcc, bt470bg, smpte170m,\n" - " smpte240m, GBR, YCgCo, bt2020nc, bt2020c\n" - " --chromaloc <integer> : Specify chroma sample location (0 to 5) [0]\n" + /* Word wrap to this width to stay under 80 characters (including ") ************/ + "Presets:\n" + " --preset=<preset> : Set options to a preset [medium]\n" + " - ultrafast, superfast, veryfast, faster,\n" + " fast, medium, slow, slower, veryslow\n" + " placebo\n" "\n" - " Parallel processing:\n" - " --threads <integer> : Maximum number of threads to use.\n" - " Disable threads if set to 0.\n" + /* Word wrap to this width to stay under 80 characters 
(including ") ************/ + "Input:\n" + " -n, --frames <integer> : Number of frames to code [all]\n" + " --seek <integer> : First frame to code [0]\n" + " --input-fps <num>/<denom> : Framerate of the input video [25.0]\n" + " --source-scan-type <string> : Set source scan type [progressive].\n" + " - progressive: progressive scan\n" + " - tff: top field first\n" + " - bff: bottom field first\n" + " --input-format : P420 or P400\n" + " --input-bitdepth : 8-16\n" + " --loop-input : Re-read input file forever\n" "\n" - " Tiles:\n" - " --tiles <int>x<int> : Split picture into width x height uniform tiles.\n" - " --tiles-width-split <string>|u<int> :\n" - " Specifies a comma separated list of pixel\n" - " positions of tiles columns separation coordinates.\n" - " Can also be u followed by and a single int n,\n" - " in which case it produces columns of uniform width.\n" - " --tiles-height-split <string>|u<int> :\n" - " Specifies a comma separated list of pixel\n" - " positions of tiles rows separation coordinates.\n" - " Can also be u followed by and a single int n,\n" - " in which case it produces rows of uniform height.\n" + /* Word wrap to this width to stay under 80 characters (including ") ************/ + "Options:\n" + " --help : Print this help message and exit\n" + " --version : Print version information and exit\n" + " --aud : Use access unit delimiters\n" + " --debug <string> : Output encoders reconstruction.\n" + " --cpuid <integer> : Disable runtime cpu optimizations with value 0.\n" + " --hash : Decoded picture hash [checksum]\n" + " - none: 0 bytes\n" + " - checksum: 18 bytes\n" + " - md5: 56 bytes\n" + " --no-psnr : Don't calculate PSNR for frames\n" + " --no-info : Don't add encoder info SEI.\n" "\n" - " Wpp:\n" - " --wpp : Enable wavefront parallel processing\n" - " --owf <integer>|auto : Number of parallel frames to process. 
0 to disable.\n" + /* Word wrap to this width to stay under 80 characters (including ") ************/ + "Video structure:\n" + " -q, --qp <integer> : Quantization Parameter [32]\n" + " -p, --period <integer> : Period of intra pictures [0]\n" + " - 0: only first picture is intra\n" + " - 1: all pictures are intra\n"
kvazaar-1.0.0.tar.gz/src/context.c -> kvazaar-1.1.0.tar.gz/src/context.c
Changed
@@ -121,6 +121,12 @@
   { 111, 141, CNU, CNU, 94, 138, 182, 154 },
 };

+static const uint8_t INIT_CU_QP_DELTA_ABS[3][2] = {
+  { 154, 154 },
+  { 154, 154 },
+  { 154, 154 },
+};
+
 static const uint8_t INIT_SIG_CG_FLAG[3][4] = {
   { 121, 140, 61, 154 },
   { 121, 140, 61, 154 },
@@ -243,6 +249,9 @@
   kvz_ctx_init(&cabac->ctx.mvp_idx_model[0], QP, INIT_MVP_IDX[slice][0]);
   kvz_ctx_init(&cabac->ctx.mvp_idx_model[1], QP, INIT_MVP_IDX[slice][1]);

+  kvz_ctx_init(&cabac->ctx.cu_qp_delta_abs[0], QP, INIT_CU_QP_DELTA_ABS[slice][0]);
+  kvz_ctx_init(&cabac->ctx.cu_qp_delta_abs[1], QP, INIT_CU_QP_DELTA_ABS[slice][1]);
+
   for (i = 0; i < 4; i++) {
     kvz_ctx_init(&cabac->ctx.cu_sig_coeff_group_model[i], QP, INIT_SIG_CG_FLAG[slice][i]);
     kvz_ctx_init(&cabac->ctx.cu_abs_model_luma[i], QP, INIT_ABS_FLAG[slice][i]);
kvazaar-1.0.0.tar.gz/src/cu.h -> kvazaar-1.1.0.tar.gz/src/cu.h
Changed
@@ -126,6 +126,13 @@
   uint16_t cbf;

+  /**
+   * \brief QP used for the CU.
+   *
+   * This is required for deblocking when per-LCU QPs are enabled.
+   */
+  uint8_t qp;
+
   union {
     struct {
       int8_t mode;
kvazaar-1.0.0.tar.gz/src/encmain.c -> kvazaar-1.1.0.tar.gz/src/encmain.c
Changed
@@ -190,10 +190,13 @@
         goto done;
       }

+      // Set PTS to make sure we pass it on correctly.
+      frame_in->pts = frames_read;
+
       bool read_success = yuv_io_read(args->input,
                                       args->opts->config->width,
                                       args->opts->config->height,
-                                      args->encoder->cfg->input_bitdepth,
+                                      args->encoder->cfg.input_bitdepth,
                                       args->encoder->bitdepth,
                                       frame_in);
       if (!read_success) {
@@ -212,7 +215,7 @@
       bool read_success = yuv_io_read(args->input,
                                       args->opts->config->width,
                                       args->opts->config->height,
-                                      args->encoder->cfg->input_bitdepth,
+                                      args->encoder->cfg.input_bitdepth,
                                       args->encoder->bitdepth,
                                       frame_in);
       if (!read_success) {
@@ -233,9 +236,9 @@
     frames_read++;

-    if (args->encoder->cfg->source_scan_type != 0) {
+    if (args->encoder->cfg.source_scan_type != 0) {
       // Set source scan type for frame, so that it will be turned into fields.
-      frame_in->interlacing = args->encoder->cfg->source_scan_type;
+      frame_in->interlacing = args->encoder->cfg.source_scan_type;
     }

     // Wait until main thread is ready to receive the next frame.
@@ -349,8 +352,8 @@
     goto exit_failure;
   }

-  encoder_control_t *encoder = enc->control;
-
+  const encoder_control_t *encoder = enc->control;
+
   fprintf(stderr, "Input: %s, output: %s\n", opts->input, opts->output);
   fprintf(stderr, " Video size: %dx%d (input=%dx%d)\n",
           encoder->in.width, encoder->in.height,
@@ -472,7 +475,7 @@
     // Compute and print stats.
     double frame_psnr[3] = { 0.0, 0.0, 0.0 };
-    if (encoder->cfg->calc_psnr && encoder->cfg->source_scan_type == KVZ_INTERLACING_NONE) {
+    if (encoder->cfg.calc_psnr && encoder->cfg.source_scan_type == KVZ_INTERLACING_NONE) {
       // Do not compute PSNR for interlaced frames, because img_rec does not contain
       // the deinterlaced frame yet.
       compute_psnr(img_src, img_rec, frame_psnr);
kvazaar-1.0.0.tar.gz/src/encode_coding_tree.c -> kvazaar-1.1.0.tar.gz/src/encode_coding_tree.c
Changed
@@ -117,7 +117,7 @@ int32_t i; uint32_t sig_coeffgroup_flag[8 * 8] = { 0 }; - int8_t be_valid = encoder->sign_hiding; + int8_t be_valid = encoder->cfg.signhide_enable; int32_t scan_pos_sig; uint32_t go_rice_param = 0; uint32_t blk_pos, pos_y, pos_x, sig, ctx_sig; @@ -174,7 +174,7 @@ int pos_last = scan[scan_pos_last]; // transform skip flag - if(width == 4 && encoder->trskip_enable) { + if(width == 4 && encoder->cfg.trskip_enable) { cabac->cur_ctx = (type == 0) ? &(cabac->ctx.transform_skip_model_luma) : &(cabac->ctx.transform_skip_model_chroma); CABAC_BIN(cabac, tr_skip, "transform_skip_flag"); } @@ -256,7 +256,7 @@ if (num_non_zero > 0) { bool sign_hidden = last_nz_pos_in_cg - first_nz_pos_in_cg >= 4 /* SBH_THRESHOLD */ - && !encoder->cfg->lossless; + && !encoder->cfg.lossless; uint32_t ctx_set = (i > 0 && type == 0) ? 2 : 0; cabac_ctx_t *base_ctx_mod; int32_t num_c1_flag, first_c2_flag_idx, idx, first_coeff2; @@ -301,13 +301,13 @@ if (be_valid && sign_hidden) { coeff_signs = coeff_signs >> 1; if(!state->cabac.only_count) - if (state->encoder_control->cfg->crypto_features & KVZ_CRYPTO_TRANSF_COEFF_SIGNS) { + if (state->encoder_control->cfg.crypto_features & KVZ_CRYPTO_TRANSF_COEFF_SIGNS) { coeff_signs = coeff_signs ^ ff_get_key(&state->tile->dbs_g, num_non_zero-1); } CABAC_BINS_EP(cabac, coeff_signs , (num_non_zero - 1), "coeff_sign_flag"); } else { if(!state->cabac.only_count) - if (state->encoder_control->cfg->crypto_features & KVZ_CRYPTO_TRANSF_COEFF_SIGNS) + if (state->encoder_control->cfg.crypto_features & KVZ_CRYPTO_TRANSF_COEFF_SIGNS) coeff_signs = coeff_signs ^ ff_get_key(&state->tile->dbs_g, num_non_zero); CABAC_BINS_EP(cabac, coeff_signs, num_non_zero, "coeff_sign_flag"); } @@ -320,7 +320,7 @@ if (abs_coeff[idx] >= base_level) { if(!state->cabac.only_count) { - if (state->encoder_control->cfg->crypto_features & KVZ_CRYPTO_TRANSF_COEFFS) + if (state->encoder_control->cfg.crypto_features & KVZ_CRYPTO_TRANSF_COEFFS) 
kvz_cabac_write_coeff_remain_encry(state, cabac, abs_coeff[idx] - base_level, go_rice_param, base_level); else kvz_cabac_write_coeff_remain(cabac, abs_coeff[idx] - base_level, go_rice_param); @@ -459,7 +459,7 @@ int intra_split_flag = (cur_cu->type == CU_INTRA && cur_cu->part_size == SIZE_NxN); // The implicit split by intra NxN is not counted towards max_tr_depth. - int tr_depth_intra = state->encoder_control->tr_depth_intra; + int tr_depth_intra = state->encoder_control->cfg.tr_depth_intra; int max_tr_depth = (cur_cu->type == CU_INTRA ? tr_depth_intra + intra_split_flag : TR_DEPTH_INTER); int8_t split = (cur_cu->tr_depth > depth); @@ -517,6 +517,28 @@ } if (cb_flag_y | cb_flag_u | cb_flag_v) { + if (state->must_code_qp_delta) { + const int qp_delta = state->qp - state->ref_qp; + const int qp_delta_abs = ABS(qp_delta); + cabac_data_t* cabac = &state->cabac; + + // cu_qp_delta_abs prefix + cabac->cur_ctx = &cabac->ctx.cu_qp_delta_abs[0]; + kvz_cabac_write_unary_max_symbol(cabac, cabac->ctx.cu_qp_delta_abs, MIN(qp_delta_abs, 5), 1, 5); + + if (qp_delta_abs >= 5) { + // cu_qp_delta_abs suffix + kvz_cabac_write_ep_ex_golomb(state, cabac, qp_delta_abs - 5, 0); + } + + if (qp_delta != 0) { + CABAC_BIN_EP(cabac, (qp_delta >= 0 ? 
0 : 1), "qp_delta_sign_flag"); + } + + state->must_code_qp_delta = false; + state->ref_qp = state->qp; + } + encode_transform_unit(state, x_pu, y_pu, depth); } } @@ -645,7 +667,7 @@ } uint32_t mvd_hor_sign = (mvd_hor>0)?0:1; if(!state->cabac.only_count) - if (state->encoder_control->cfg->crypto_features & KVZ_CRYPTO_MV_SIGNS) + if (state->encoder_control->cfg.crypto_features & KVZ_CRYPTO_MV_SIGNS) mvd_hor_sign = mvd_hor_sign^ff_get_key(&state->tile->dbs_g, 1); CABAC_BIN_EP(cabac, mvd_hor_sign, "mvd_sign_flag_hor"); } @@ -655,7 +677,7 @@ } uint32_t mvd_ver_sign = (mvd_ver>0)?0:1; if(!state->cabac.only_count) - if (state->encoder_control->cfg->crypto_features & KVZ_CRYPTO_MV_SIGNS) + if (state->encoder_control->cfg.crypto_features & KVZ_CRYPTO_MV_SIGNS) mvd_ver_sign = mvd_ver_sign^ff_get_key(&state->tile->dbs_g, 1); CABAC_BIN_EP(cabac, mvd_ver_sign, "mvd_sign_flag_ver"); } @@ -873,7 +895,7 @@ CABAC_BIN(cabac, 0, "part_mode horizontal"); } - if (state->encoder_control->cfg->amp_enable && depth < MAX_DEPTH) { + if (state->encoder_control->cfg.amp_enable && depth < MAX_DEPTH) { cabac->cur_ctx = &(cabac->ctx.part_size_model[3]); if (cur_cu->part_size == SIZE_2NxN || @@ -894,14 +916,16 @@ } void kvz_encode_coding_tree(encoder_state_t * const state, - uint16_t x_ctb, uint16_t y_ctb, uint8_t depth) + uint16_t x_ctb, + uint16_t y_ctb, + uint8_t depth) { cabac_data_t * const cabac = &state->cabac; const videoframe_t * const frame = state->tile->frame; const cu_info_t *cur_cu = kvz_videoframe_get_cu_const(frame, x_ctb, y_ctb); uint8_t split_flag = GET_SPLITDATA(cur_cu, depth); uint8_t split_model = 0; - + //Absolute ctb uint16_t abs_x_ctb = x_ctb + (state->tile->lcu_offset_x * LCU_WIDTH) / (LCU_WIDTH >> MAX_DEPTH); uint16_t abs_y_ctb = y_ctb + (state->tile->lcu_offset_y * LCU_WIDTH) / (LCU_WIDTH >> MAX_DEPTH); @@ -949,7 +973,7 @@ } } - if (state->encoder_control->cfg->lossless) { + if (state->encoder_control->cfg.lossless) { cabac->cur_ctx = &cabac->ctx.cu_transquant_bypass; 
CABAC_BIN(cabac, 1, "cu_transquant_bypass_flag"); }
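The must_code_qp_delta block added in the encode_coding_tree.c diff above writes a QP delta in three parts: a unary prefix capped at 5, a 0th-order Exp-Golomb suffix for the remainder when the absolute value is 5 or more, and a sign bin when the delta is non-zero. A minimal sketch of that binarization follows, emitting the bins as a string of '0' and '1' characters. It is an illustration only: push_bin and binarize_cu_qp_delta are hypothetical names, the prefix is assumed to be plain truncated unary, and the sketch does not model which bins are context coded and which are bypass coded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Appends one bin to a NUL-terminated string; returns 0 when it is full. */
static int push_bin(char *out, size_t cap, size_t *len, int bin)
{
  if (*len + 1 >= cap) return 0;
  out[(*len)++] = bin ? '1' : '0';
  out[*len] = '\0';
  return 1;
}

/* Returns the number of bins written, or 0 if the buffer is too small. */
static size_t binarize_cu_qp_delta(int qp_delta, char *out, size_t cap)
{
  assert(out != NULL);
  if (cap == 0) return 0;
  out[0] = '\0';

  size_t len = 0;
  const unsigned abs_val = (unsigned)abs(qp_delta);
  const unsigned prefix = abs_val < 5 ? abs_val : 5;

  /* Prefix: truncated unary with a maximum of 5. */
  for (unsigned i = 0; i < prefix; i++) {
    if (!push_bin(out, cap, &len, 1)) return 0;
  }
  if (prefix < 5 && !push_bin(out, cap, &len, 0)) return 0;

  /* Suffix: 0th-order Exp-Golomb of (abs_val - 5). */
  if (abs_val >= 5) {
    unsigned rem = abs_val - 5;
    unsigned k = 0;
    while (rem >= (1u << k)) {
      if (!push_bin(out, cap, &len, 1)) return 0;
      rem -= 1u << k;
      k++;
    }
    if (!push_bin(out, cap, &len, 0)) return 0;
    while (k-- > 0) {
      if (!push_bin(out, cap, &len, (int)((rem >> k) & 1u))) return 0;
    }
  }

  /* Sign: 0 for positive and 1 for negative; omitted for a zero delta. */
  if (qp_delta != 0 && !push_bin(out, cap, &len, qp_delta < 0)) return 0;

  return len;
}
```

For example, a delta of 3 stays entirely in the prefix, while a delta of -5 reaches the cap and therefore carries a suffix for a remainder of zero before its sign bin.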
kvazaar-1.0.0.tar.gz/src/encode_coding_tree.h -> kvazaar-1.1.0.tar.gz/src/encode_coding_tree.h
Changed
@@ -29,7 +29,7 @@
 #include "encoderstate.h"
 #include "global.h"

-void kvz_encode_coding_tree(encoder_state_t *state,
+void kvz_encode_coding_tree(encoder_state_t * const state,
                             uint16_t x_ctb,
                             uint16_t y_ctb,
                             uint8_t depth);
kvazaar-1.0.0.tar.gz/src/encoder.c -> kvazaar-1.1.0.tar.gz/src/encoder.c
Changed
@@ -90,7 +90,7 @@ threads_per_frame -= 2; } - if (cfg->gop_lowdelay && cfg->gop_lp_definition.t > 1) { + if (cfg->gop_len && cfg->gop_lowdelay && cfg->gop_lp_definition.t > 1) { // Temporal skipping makes every other frame very fast to encode so // more parallel frames should be used. frames *= 2; @@ -120,7 +120,8 @@ * \param cfg encoder configuration * \return initialized encoder control or NULL on failure */ -encoder_control_t* kvz_encoder_control_init(kvz_config *const cfg) { +encoder_control_t* kvz_encoder_control_init(const kvz_config *const cfg) +{ encoder_control_t *encoder = NULL; if (!cfg) { @@ -128,20 +129,6 @@ goto init_failed; } - if (cfg->threads == -1) { - cfg->threads = cfg_num_threads(); - } - - if (cfg->gop_len > 0) { - if (cfg->tmvp_enable) { - cfg->tmvp_enable = false; - fprintf(stderr, "Disabling TMVP because GOP is used.\n"); - } - if (cfg->gop_lowdelay) { - kvz_config_process_lp_gop(cfg); - } - } - // Make sure that the parameters make sense. if (!kvz_config_validate(cfg)) { goto init_failed; @@ -153,73 +140,78 @@ goto init_failed; } + // Take a copy of the config. + memcpy(&encoder->cfg, cfg, sizeof(encoder->cfg)); + // Set fields that are not copied to NULL. + encoder->cfg.cqmfile = NULL; + encoder->cfg.tiles_width_split = NULL; + encoder->cfg.tiles_height_split = NULL; + encoder->cfg.slice_addresses_in_ts = NULL; + + if (encoder->cfg.threads == -1) { + encoder->cfg.threads = cfg_num_threads(); + } + + if (encoder->cfg.gop_len > 0) { + if (encoder->cfg.gop_lowdelay) { + kvz_config_process_lp_gop(&encoder->cfg); + } + } + // Need to set owf before initializing threadqueue. 
- if (cfg->owf >= 0) { - encoder->owf = cfg->owf; - } else { - encoder->owf = select_owf_auto(cfg); - fprintf(stderr, "--owf=auto value set to %d.\n", encoder->owf); + if (encoder->cfg.owf < 0) { + encoder->cfg.owf = select_owf_auto(&encoder->cfg); + fprintf(stderr, "--owf=auto value set to %d.\n", encoder->cfg.owf); } - if (cfg->source_scan_type != KVZ_INTERLACING_NONE) { + if (encoder->cfg.source_scan_type != KVZ_INTERLACING_NONE) { // If using interlaced coding with OWF, the OWF has to be an even number // to ensure that the pair of fields will be output for the same picture. - if (encoder->owf % 2 == 1) { - encoder->owf += 1; + if (encoder->cfg.owf % 2 == 1) { + encoder->cfg.owf += 1; } } encoder->threadqueue = MALLOC(threadqueue_queue_t, 1); if (!encoder->threadqueue || !kvz_threadqueue_init(encoder->threadqueue, - cfg->threads, - encoder->owf > 0)) { + encoder->cfg.threads, + encoder->cfg.owf > 0)) { fprintf(stderr, "Could not initialize threadqueue.\n"); goto init_failed; } - // Config pointer to config struct - encoder->cfg = cfg; - encoder->bitdepth = KVZ_BIT_DEPTH; - encoder->chroma_format = KVZ_FORMAT2CSP(cfg->input_format); - - // deblocking filter - encoder->deblock_enable = 1; - encoder->beta_offset_div2 = 0; - encoder->tc_offset_div2 = 0; - // SAO - encoder->sao_enable = 1; - // Rate-distortion optimization level - encoder->rdo = 1; - encoder->full_intra_search = 0; + encoder->chroma_format = KVZ_FORMAT2CSP(encoder->cfg.input_format); // Interlacing - encoder->in.source_scan_type = (int8_t)cfg->source_scan_type; - encoder->vui.field_seq_flag = encoder->cfg->source_scan_type != 0; - encoder->vui.frame_field_info_present_flag = encoder->cfg->source_scan_type != 0; + encoder->in.source_scan_type = (int8_t)encoder->cfg.source_scan_type; + encoder->vui.field_seq_flag = encoder->cfg.source_scan_type != 0; + encoder->vui.frame_field_info_present_flag = encoder->cfg.source_scan_type != 0; // Initialize the scaling list 
kvz_scalinglist_init(&encoder->scaling_list); // CQM - { - FILE* cqmfile; - cqmfile = cfg->cqmfile ? fopen(cfg->cqmfile, "rb") : NULL; + if (cfg->cqmfile) { + FILE* cqmfile = fopen(cfg->cqmfile, "rb"); if (cqmfile) { kvz_scalinglist_parse(&encoder->scaling_list, cqmfile); fclose(cqmfile); + } else { + fprintf(stderr, "Could not open CQM file.\n"); + goto init_failed; } } kvz_scalinglist_process(&encoder->scaling_list, encoder->bitdepth); - - kvz_encoder_control_input_init(encoder, cfg->width, cfg->height); - if (cfg->framerate_num != 0) { - double framerate = cfg->framerate_num / (double)cfg->framerate_denom; - encoder->target_avg_bppic = cfg->target_bitrate / (framerate); + kvz_encoder_control_input_init(encoder, encoder->cfg.width, encoder->cfg.height); + + if (encoder->cfg.framerate_num != 0) { + double framerate = encoder->cfg.framerate_num / (double)encoder->cfg.framerate_denom; + encoder->target_avg_bppic = encoder->cfg.target_bitrate / framerate; } else { - encoder->target_avg_bppic = cfg->target_bitrate / cfg->framerate; + encoder->target_avg_bppic = encoder->cfg.target_bitrate / encoder->cfg.framerate; } encoder->target_avg_bpp = encoder->target_avg_bppic / encoder->in.pixels_per_pic; @@ -227,23 +219,30 @@ goto init_failed; } + // Copy delta QP array for ROI coding. 
+ if (cfg->roi.dqps) { + const size_t roi_size = encoder->cfg.roi.width * encoder->cfg.roi.height; + encoder->cfg.roi.dqps = calloc(roi_size, sizeof(cfg->roi.dqps[0])); + memcpy(encoder->cfg.roi.dqps, + cfg->roi.dqps, + roi_size * sizeof(*cfg->roi.dqps)); + } + //Tiles - encoder->tiles_enable = encoder->cfg->tiles_width_count > 1 || - encoder->cfg->tiles_height_count > 1; + encoder->tiles_enable = encoder->cfg.tiles_width_count > 1 || + encoder->cfg.tiles_height_count > 1; { - int i, j; //iteration variables const int num_ctbs = encoder->in.width_in_lcu * encoder->in.height_in_lcu; - int tileIdx, x, y; //iterations variable for 6-9 //Temporary pointers to allow encoder fields to be const int32_t *tiles_col_width, *tiles_row_height, *tiles_ctb_addr_rs_to_ts, *tiles_ctb_addr_ts_to_rs, *tiles_tile_id, *tiles_col_bd, *tiles_row_bd; - if (encoder->cfg->tiles_width_count > encoder->in.width_in_lcu) { + if (encoder->cfg.tiles_width_count > encoder->in.width_in_lcu) { fprintf(stderr, "Too many tiles (width)!\n"); goto init_failed; - } else if (encoder->cfg->tiles_height_count > encoder->in.height_in_lcu) { + } else if (encoder->cfg.tiles_height_count > encoder->in.height_in_lcu) { fprintf(stderr, "Too many tiles (height)!\n"); goto init_failed; } @@ -251,19 +250,15 @@ //Will be (perhaps) changed later encoder->tiles_uniform_spacing_flag = 1; - //tilesn[x,y] contains the number of _separation_ between tiles, whereas the encoder needs the number of tiles. - encoder->tiles_num_tile_columns = encoder->cfg->tiles_width_count; - encoder->tiles_num_tile_rows = encoder->cfg->tiles_height_count; -
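The encoder.c hunks above switch the encoder from holding a pointer to the caller's config to holding its own copy: a memcpy of the struct, pointer fields that are not carried over set to NULL, and the ROI delta-QP array duplicated. A minimal sketch of that copy pattern follows. demo_config_t and demo_config_copy are hypothetical, cut-down stand-ins rather than kvazaar types, and the allocation check is an addition of this sketch; the lines visible in the diff do not test the calloc result before the memcpy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for kvz_config. */
typedef struct {
  int threads;
  char *cqmfile;  /* owned by the caller; not carried over to the copy */
  struct {
    int width;
    int height;
    int8_t *dqps; /* deep-copied so the copy outlives the caller's buffer */
  } roi;
} demo_config_t;

/* Copies src into dst. Returns 1 on success, 0 on allocation failure. */
static int demo_config_copy(demo_config_t *dst, const demo_config_t *src)
{
  assert(dst != NULL && src != NULL);

  /* Shallow copy first, as the diff does with memcpy. */
  memcpy(dst, src, sizeof(*dst));

  /* Clear every pointer field so that dst never aliases src. */
  dst->cqmfile = NULL;
  dst->roi.dqps = NULL;

  const size_t roi_size = (size_t)src->roi.width * (size_t)src->roi.height;
  if (src->roi.dqps && roi_size > 0) {
    dst->roi.dqps = malloc(roi_size * sizeof(*dst->roi.dqps));
    if (!dst->roi.dqps) return 0;
    memcpy(dst->roi.dqps, src->roi.dqps, roi_size * sizeof(*dst->roi.dqps));
  }

  return 1;
}
```

Owning a copy is what lets kvz_encoder_control_init take a const kvz_config pointer in the new version: adjustments such as resolving threads == -1 are applied to the encoder's copy instead of the caller's struct.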
kvazaar-1.0.0.tar.gz/src/encoder.h -> kvazaar-1.1.0.tar.gz/src/encoder.h
Changed
@@ -35,9 +35,19 @@ /* Encoder control options, the main struct */ typedef struct encoder_control_t { - /* Configuration */ - const kvz_config *cfg; - + /** + * \brief Configuration. + * + * NOTE: The following fields are not copied from the config passed to + * kvz_encoder_control_init and must not be accessed: + * - cqmfile + * - tiles_width_split + * - tiles_height_split + * - slice_addresses_in_ts + * Use appropriate fields in encoder_control_t instead. + */ + kvz_config cfg; + /* Input */ struct { int32_t width; @@ -49,31 +59,17 @@ int64_t pixels_per_pic; int8_t source_scan_type; } in; - + /* TODO: add ME data */ struct { void(*IME)(); void(*FME)(); int range; } me; - + int8_t bitdepth; enum kvz_chroma_format chroma_format; - int8_t tr_depth_intra; - - int8_t fme_level; - - /* Filtering */ - int8_t deblock_enable; // \brief Flag to enable deblocking filter - int8_t sao_enable; // \brief Flag to enable sample adaptive offset filter - int8_t rdoq_enable; // \brief Whether RDOQ is enabled or not. - int8_t rdo; // \brief RDO level - int8_t full_intra_search; // \brief Whether to skip intra modes during search. - int8_t trskip_enable; // \brief Flag to enable transform skipping (4x4 intra) - int8_t beta_offset_div2; // \brief (deblocking) beta offset (div 2), range -6...6 - int8_t tc_offset_div2; // \brief (deblocking)tc offset (div 2), range -6...6 - /* VUI */ struct { @@ -81,70 +77,37 @@ int32_t num_units_in_tick; /*!< \brief Timing scale numerator */ int32_t time_scale; /*!< \brief Timing scale denominator */ - int16_t sar_width; - int16_t sar_height; - int8_t overscan; - int8_t videoformat; - int8_t fullrange; - int8_t colorprim; - int8_t transfer; - int8_t colormatrix; - int8_t chroma_loc; - int8_t field_seq_flag; int8_t frame_field_info_present_flag; int8_t timing_info_present_flag; } vui; - int8_t aud_enable; - //scaling list scaling_list_t scaling_list; - + //spec: references to variables defined in Rec. 
ITU-T H.265 (04/2013) int8_t tiles_enable; /*!<spec: tiles_enabled */ - + int8_t tiles_uniform_spacing_flag; /*!<spec: uniform_spacing_flag */ - - uint8_t tiles_num_tile_columns; /*!<spec: num_tile_columns_minus1 + 1 */ - uint8_t tiles_num_tile_rows; /*!<spec: num_tile_rows_minus1 + 1*/ - + const int32_t *tiles_col_width; /*!<spec: colWidth (6.5.1); dimension: tiles_num_tile_columns */ const int32_t *tiles_row_height; /*!<spec: rowHeight (6.5.1); dimension: tiles_num_tile_rows */ - + const int32_t *tiles_col_bd; /*!<spec: colBd (6.5.1); dimension: tiles_num_tile_columns + 1 */ const int32_t *tiles_row_bd; /*!<spec: rowBd (6.5.1); dimension: tiles_num_tile_rows + 1 */ - + //PicSizeInCtbsY = height_in_lcu * width_in_lcu const int32_t *tiles_ctb_addr_rs_to_ts; /*!<spec: CtbAddrRsToTs (6.5.1); dimension: PicSizeInCtbsY */ const int32_t *tiles_ctb_addr_ts_to_rs; /*!<spec: CtbAddrTsToRs (6.5.1); dimension: PicSizeInCtbsY */ - + const int32_t *tiles_tile_id; /*!<spec: TileId (6.5.1); dimension: PicSizeInCtbsY */ - - //WPP - int wpp; - - //OWF 0 = no owf, 1 = 1 frame, 2 = 2 frames, etc. - int owf; - + //Slices int slice_count; const int* slice_addresses_in_ts; - - threadqueue_queue_t *threadqueue; - struct { - uint8_t min; - uint8_t max; - } pu_depth_inter, pu_depth_intra; - - // How often Video Parameter Set is re-sent. - int32_t vps_period; - - bool sign_hiding; - - bool implicit_rdpcm; + threadqueue_queue_t *threadqueue; //! Target average bits per picture. double target_avg_bppic; @@ -155,9 +118,14 @@ //! Picture weights when GOP is used. double gop_layer_weights[MAX_GOP_LAYERS]; + //! 
pic_parameter_set + struct { + uint8_t dependent_slice_segments_enabled_flag; + } pps; + } encoder_control_t; -encoder_control_t* kvz_encoder_control_init(kvz_config *cfg); +encoder_control_t* kvz_encoder_control_init(const kvz_config *cfg); void kvz_encoder_control_free(encoder_control_t *encoder); void kvz_encoder_control_input_init(encoder_control_t *encoder, int32_t width, int32_t height);
kvazaar-1.0.0.tar.gz/src/encoder_state-bitstream.c -> kvazaar-1.1.0.tar.gz/src/encoder_state-bitstream.c
Changed
@@ -195,7 +195,7 @@ #ifdef KVZ_DEBUG printf("=========== VUI Set ID: 0 ===========\n"); #endif - if (encoder->vui.sar_width > 0 && encoder->vui.sar_height > 0) { + if (encoder->cfg.vui.sar_width > 0 && encoder->cfg.vui.sar_height > 0) { int i; static const struct { @@ -213,16 +213,16 @@ }; for (i = 0; sar[i].idc != 255; i++) - if (sar[i].width == encoder->vui.sar_width && - sar[i].height == encoder->vui.sar_height) + if (sar[i].width == encoder->cfg.vui.sar_width && + sar[i].height == encoder->cfg.vui.sar_height) break; WRITE_U(stream, 1, 1, "aspect_ratio_info_present_flag"); WRITE_U(stream, sar[i].idc, 8, "aspect_ratio_idc"); if (sar[i].idc == 255) { // EXTENDED_SAR - WRITE_U(stream, encoder->vui.sar_width, 16, "sar_width"); - WRITE_U(stream, encoder->vui.sar_height, 16, "sar_height"); + WRITE_U(stream, encoder->cfg.vui.sar_width, 16, "sar_width"); + WRITE_U(stream, encoder->cfg.vui.sar_height, 16, "sar_height"); } } else WRITE_U(stream, 0, 1, "aspect_ratio_info_present_flag"); @@ -230,28 +230,31 @@ //IF aspect ratio info //ENDIF - if (encoder->vui.overscan > 0) { + if (encoder->cfg.vui.overscan > 0) { WRITE_U(stream, 1, 1, "overscan_info_present_flag"); - WRITE_U(stream, encoder->vui.overscan - 1, 1, "overscan_appropriate_flag"); + WRITE_U(stream, encoder->cfg.vui.overscan - 1, 1, "overscan_appropriate_flag"); } else WRITE_U(stream, 0, 1, "overscan_info_present_flag"); //IF overscan info //ENDIF - if (encoder->vui.videoformat != 5 || encoder->vui.fullrange || - encoder->vui.colorprim != 2 || encoder->vui.transfer != 2 || - encoder->vui.colormatrix != 2) { + if (encoder->cfg.vui.videoformat != 5 || + encoder->cfg.vui.fullrange != 0 || + encoder->cfg.vui.colorprim != 2 || + encoder->cfg.vui.transfer != 2 || + encoder->cfg.vui.colormatrix != 2) { WRITE_U(stream, 1, 1, "video_signal_type_present_flag"); - WRITE_U(stream, encoder->vui.videoformat, 3, "chroma_format"); - WRITE_U(stream, encoder->vui.fullrange, 1, "video_full_range_flag"); + WRITE_U(stream, 
encoder->cfg.vui.videoformat, 3, "chroma_format"); + WRITE_U(stream, encoder->cfg.vui.fullrange, 1, "video_full_range_flag"); - if (encoder->vui.colorprim != 2 || encoder->vui.transfer != 2 || - encoder->vui.colormatrix != 2) { + if (encoder->cfg.vui.colorprim != 2 || + encoder->cfg.vui.transfer != 2 || + encoder->cfg.vui.colormatrix != 2) { WRITE_U(stream, 1, 1, "colour_description_present_flag"); - WRITE_U(stream, encoder->vui.colorprim, 8, "colour_primaries"); - WRITE_U(stream, encoder->vui.transfer, 8, "transfer_characteristics"); - WRITE_U(stream, encoder->vui.colormatrix, 8, "matrix_coeffs"); + WRITE_U(stream, encoder->cfg.vui.colorprim, 8, "colour_primaries"); + WRITE_U(stream, encoder->cfg.vui.transfer, 8, "transfer_characteristics"); + WRITE_U(stream, encoder->cfg.vui.colormatrix, 8, "matrix_coeffs"); } else WRITE_U(stream, 0, 1, "colour_description_present_flag"); } else @@ -260,10 +263,10 @@ //IF video type //ENDIF - if (encoder->vui.chroma_loc > 0) { + if (encoder->cfg.vui.chroma_loc > 0) { WRITE_U(stream, 1, 1, "chroma_loc_info_present_flag"); - WRITE_UE(stream, encoder->vui.chroma_loc, "chroma_sample_loc_type_top_field"); - WRITE_UE(stream, encoder->vui.chroma_loc, "chroma_sample_loc_type_bottom_field"); + WRITE_UE(stream, encoder->cfg.vui.chroma_loc, "chroma_sample_loc_type_top_field"); + WRITE_UE(stream, encoder->cfg.vui.chroma_loc, "chroma_sample_loc_type_bottom_field"); } else WRITE_U(stream, 0, 1, "chroma_loc_info_present_flag"); @@ -297,8 +300,8 @@ static void encoder_state_write_bitstream_SPS_extension(bitstream_t *stream, encoder_state_t * const state) { - if (state->encoder_control->cfg->implicit_rdpcm && - state->encoder_control->cfg->lossless) { + const kvz_config *cfg = &state->encoder_control->cfg; + if (cfg->implicit_rdpcm && cfg->lossless) { WRITE_U(stream, 1, 1, "sps_extension_present_flag"); WRITE_U(stream, 1, 1, "sps_range_extension_flag"); @@ -372,12 +375,12 @@ WRITE_U(stream, 0, 1, "sps_sub_layer_ordering_info_present_flag"); //for 
each layer - if (encoder->cfg->gop_lowdelay) { - WRITE_UE(stream, encoder->cfg->ref_frames, "sps_max_dec_pic_buffering"); + if (encoder->cfg.gop_lowdelay) { + WRITE_UE(stream, encoder->cfg.ref_frames, "sps_max_dec_pic_buffering"); WRITE_UE(stream, 0, "sps_num_reorder_pics"); } else { - WRITE_UE(stream, encoder->cfg->ref_frames + encoder->cfg->gop_len, "sps_max_dec_pic_buffering"); - WRITE_UE(stream, encoder->cfg->gop_len, "sps_num_reorder_pics"); + WRITE_UE(stream, encoder->cfg.ref_frames + encoder->cfg.gop_len, "sps_max_dec_pic_buffering"); + WRITE_UE(stream, encoder->cfg.gop_len, "sps_num_reorder_pics"); } WRITE_UE(stream, 0, "sps_max_latency_increase"); //end for @@ -387,7 +390,7 @@ WRITE_UE(stream, 0, "log2_min_transform_block_size_minus2"); // 4x4 WRITE_UE(stream, 3, "log2_diff_max_min_transform_block_size"); // 4x4...32x32 WRITE_UE(stream, TR_DEPTH_INTER, "max_transform_hierarchy_depth_inter"); - WRITE_UE(stream, encoder->tr_depth_intra, "max_transform_hierarchy_depth_intra"); + WRITE_UE(stream, encoder->cfg.tr_depth_intra, "max_transform_hierarchy_depth_intra"); // scaling list WRITE_U(stream, encoder->scaling_list.enable, 1, "scaling_list_enable_flag"); @@ -396,9 +399,9 @@ encoder_state_write_bitstream_scaling_list(stream, state); } - WRITE_U(stream, (encoder->cfg->amp_enable ? 1 : 0), 1, "amp_enabled_flag"); + WRITE_U(stream, (encoder->cfg.amp_enable ? 1 : 0), 1, "amp_enabled_flag"); - WRITE_U(stream, encoder->sao_enable ? 1 : 0, 1, + WRITE_U(stream, encoder->cfg.sao_enable ? 
1 : 0, 1, "sample_adaptive_offset_enabled_flag"); WRITE_U(stream, ENABLE_PCM, 1, "pcm_enabled_flag"); #if ENABLE_PCM == 1 @@ -419,7 +422,7 @@ //IF long_term_ref_pics_present //ENDIF - WRITE_U(stream, state->encoder_control->cfg->tmvp_enable, 1, + WRITE_U(stream, state->encoder_control->cfg.tmvp_enable, 1, "sps_temporal_mvp_enable_flag"); WRITE_U(stream, 0, 1, "sps_strong_intra_smoothing_enable_flag"); WRITE_U(stream, 1, 1, "vui_parameters_present_flag"); @@ -440,20 +443,25 @@ #endif WRITE_UE(stream, 0, "pic_parameter_set_id"); WRITE_UE(stream, 0, "seq_parameter_set_id"); - WRITE_U(stream, 0, 1, "dependent_slice_segments_enabled_flag"); + WRITE_U(stream, encoder->pps.dependent_slice_segments_enabled_flag, 1, "dependent_slice_segments_enabled_flag"); WRITE_U(stream, 0, 1, "output_flag_present_flag"); WRITE_U(stream, 0, 3, "num_extra_slice_header_bits"); - WRITE_U(stream, encoder->sign_hiding, 1, "sign_data_hiding_flag"); + WRITE_U(stream, encoder->cfg.signhide_enable, 1, "sign_data_hiding_flag"); WRITE_U(stream, 0, 1, "cabac_init_present_flag"); WRITE_UE(stream, 0, "num_ref_idx_l0_default_active_minus1"); WRITE_UE(stream, 0, "num_ref_idx_l1_default_active_minus1"); - WRITE_SE(stream, ((int8_t)encoder->cfg->qp) - 26, "pic_init_qp_minus26"); + WRITE_SE(stream, ((int8_t)encoder->cfg.qp) - 26, "pic_init_qp_minus26"); WRITE_U(stream, 0, 1, "constrained_intra_pred_flag"); - WRITE_U(stream, encoder->trskip_enable, 1, "transform_skip_enabled_flag"); + WRITE_U(stream, encoder->cfg.trskip_enable, 1, "transform_skip_enabled_flag"); + + if (encoder->cfg.target_bitrate > 0 || encoder->cfg.roi.dqps != NULL) { + // Use separate QP for each LCU when rate control is enabled. 
+ WRITE_U(stream, 1, 1, "cu_qp_delta_enabled_flag"); + WRITE_UE(stream, 0, "diff_cu_qp_delta_depth"); + } else { WRITE_U(stream, 0, 1, "cu_qp_delta_enabled_flag"); - //if cu_qp_delta_enabled_flag - //WRITE_UE(stream, 0, "diff_cu_qp_delta_depth"); + } //TODO: add QP offsets WRITE_SE(stream, 0, "pps_cb_qp_offset"); @@ -463,23 +471,23 @@ WRITE_U(stream, 0, 1, "weighted_bipred_idc"); //WRITE_U(stream, 0, 1, "dependent_slices_enabled_flag"); - WRITE_U(stream, encoder->cfg->lossless, 1, "transquant_bypass_enable_flag"); + WRITE_U(stream, encoder->cfg.lossless, 1, "transquant_bypass_enable_flag"); WRITE_U(stream, encoder->tiles_enable, 1, "tiles_enabled_flag"); //wavefronts - WRITE_U(stream, encoder->wpp, 1, "entropy_coding_sync_enabled_flag"); + WRITE_U(stream, encoder->cfg.wpp, 1, "entropy_coding_sync_enabled_flag"); if (encoder->tiles_enable) { - WRITE_UE(stream, encoder->tiles_num_tile_columns - 1, "num_tile_columns_minus1"); - WRITE_UE(stream, encoder->tiles_num_tile_rows - 1, "num_tile_rows_minus1"); + WRITE_UE(stream, encoder->cfg.tiles_width_count - 1, "num_tile_columns_minus1"); + WRITE_UE(stream, encoder->cfg.tiles_height_count - 1, "num_tile_rows_minus1"); WRITE_U(stream, encoder->tiles_uniform_spacing_flag, 1, "uniform_spacing_flag"); if (!encoder->tiles_uniform_spacing_flag) { int i; - for (i = 0; i < encoder->tiles_num_tile_columns - 1; ++i) {
kvazaar-1.0.0.tar.gz/src/encoder_state-bitstream.h -> kvazaar-1.1.0.tar.gz/src/encoder_state-bitstream.h
Changed
@@ -35,10 +35,12 @@ struct bitstream_t; -void kvz_encoder_state_write_bitstream_slice_header(struct encoder_state_t * const state); +void kvz_encoder_state_write_bitstream_slice_header( + struct bitstream_t * const stream, + struct encoder_state_t * const state, + bool independent); void kvz_encoder_state_write_bitstream(struct encoder_state_t * const state); void kvz_encoder_state_write_bitstream_leaf(struct encoder_state_t * const state); -void kvz_encoder_state_worker_write_bitstream_leaf(void * opaque); void kvz_encoder_state_worker_write_bitstream(void * opaque); void kvz_encoder_state_write_parameter_sets(struct bitstream_t *stream, struct encoder_state_t * const state);
kvazaar-1.0.0.tar.gz/src/encoder_state-ctors_dtors.c -> kvazaar-1.1.0.tar.gz/src/encoder_state-ctors_dtors.c
Changed
@@ -48,13 +48,23 @@ state->frame->poc = 0; state->frame->total_bits_coded = 0; state->frame->cur_gop_bits_coded = 0; + state->frame->prepared = 0; + state->frame->done = 1; state->frame->rc_alpha = 3.2003; state->frame->rc_beta = -1.367; + + const encoder_control_t * const encoder = state->encoder_control; + const int num_lcus = encoder->in.width_in_lcu * encoder->in.height_in_lcu; + state->frame->lcu_stats = MALLOC(lcu_stats_t, num_lcus); + return 1; } static void encoder_state_config_frame_finalize(encoder_state_t * const state) { + if (state->frame == NULL) return; + kvz_image_list_destroy(state->frame->ref); + FREE_POINTER(state->frame->lcu_stats); } static int encoder_state_config_tile_init(encoder_state_t * const state, @@ -96,15 +106,18 @@ state->tile->hor_buf_search = kvz_yuv_t_alloc(luma_size, chroma_size_hor); state->tile->ver_buf_search = kvz_yuv_t_alloc(luma_size, chroma_size_ver); - if (encoder->sao_enable) { + if (encoder->cfg.sao_enable) { state->tile->hor_buf_before_sao = kvz_yuv_t_alloc(luma_size, chroma_size_hor); } else { state->tile->hor_buf_before_sao = NULL; } - if (encoder->wpp) { + if (encoder->cfg.wpp) { int num_jobs = state->tile->frame->width_in_lcu * state->tile->frame->height_in_lcu; state->tile->wf_jobs = MALLOC(threadqueue_job_t*, num_jobs); + for (int i = 0; i < num_jobs; ++i) { + state->tile->wf_jobs[i] = NULL; + } if (!state->tile->wf_jobs) { printf("Error allocating wf_jobs array!\n"); return 0; @@ -113,17 +126,12 @@ state->tile->wf_jobs = NULL; } state->tile->id = encoder->tiles_tile_id[state->tile->lcu_offset_in_ts]; - - state->tile->dbs_g = NULL; - if (state->encoder_control->cfg->crypto_features) { - state->tile->dbs_g = InitC(); - } - state->tile->m_prev_pos = 0; - return 1; } static void encoder_state_config_tile_finalize(encoder_state_t * const state) { + if (state->tile == NULL) return; + if (state->tile->hor_buf_before_sao) kvz_yuv_t_free(state->tile->hor_buf_before_sao); kvz_yuv_t_free(state->tile->hor_buf_search); @@ 
-131,7 +139,7 @@ kvz_videoframe_free(state->tile->frame); state->tile->frame = NULL; - if (state->encoder_control->cfg->crypto_features) { + if (state->encoder_control->cfg.crypto_features && state->tile->dbs_g) { DeleteCryptoC(state->tile->dbs_g); } FREE_POINTER(state->tile->wf_jobs); @@ -156,10 +164,6 @@ return 1; } -static void encoder_state_config_slice_finalize(encoder_state_t * const state) { - //Nothing to do (yet?) -} - static int encoder_state_config_wfrow_init(encoder_state_t * const state, const int lcu_offset_y) { @@ -167,10 +171,6 @@ return 1; } -static void encoder_state_config_wfrow_finalize(encoder_state_t * const state) { - //Nothing to do (yet?) -} - #ifdef KVZ_DEBUG_PRINT_THREADING_INFO static void encoder_state_dump_graphviz(const encoder_state_t * const state) { int i; @@ -310,8 +310,6 @@ child_state->children[0].encoder_control = NULL; child_state->tqj_bitstream_written = NULL; child_state->tqj_recon_done = NULL; - child_state->prepared = 0; - child_state->frame_done = 1; if (!parent_state) { const encoder_control_t * const encoder = child_state->encoder_control; @@ -327,6 +325,8 @@ fprintf(stderr, "Could not initialize encoder_state->tile!\n"); return 0; } + + child_state->tile->dbs_g = NULL; // Not used. The used state is in the sub-tile. child_state->slice = MALLOC(encoder_state_config_slice_t, 1); if (!child_state->slice || !encoder_state_config_slice_init(child_state, 0, encoder->in.width_in_lcu * encoder->in.height_in_lcu - 1)) { fprintf(stderr, "Could not initialize encoder_state->slice!\n"); @@ -364,7 +364,10 @@ int children_allow_tile = 0; int range_start; - int start_in_ts, end_in_ts; + // First index of this encoder state in tile scan order. + int start_in_ts; + // Index of the first LCU after this state in tile scan order. 
+ int end_in_ts; switch(child_state->type) { case ENCODER_STATE_TYPE_MAIN: @@ -376,14 +379,16 @@ case ENCODER_STATE_TYPE_SLICE: assert(child_state->parent); if (child_state->parent->type != ENCODER_STATE_TYPE_TILE) children_allow_tile = 1; - children_allow_wavefront_row = encoder->wpp; start_in_ts = child_state->slice->start_in_ts; - end_in_ts = child_state->slice->end_in_ts; + end_in_ts = child_state->slice->end_in_ts + 1; + int num_wpp_rows = (end_in_ts - start_in_ts) / child_state->tile->frame->width_in_lcu; + children_allow_wavefront_row = encoder->cfg.wpp && num_wpp_rows > 1; break; case ENCODER_STATE_TYPE_TILE: assert(child_state->parent); if (child_state->parent->type != ENCODER_STATE_TYPE_SLICE) children_allow_slice = 1; - children_allow_wavefront_row = encoder->wpp; + children_allow_wavefront_row = + encoder->cfg.wpp && child_state->tile->frame->height_in_lcu > 1; start_in_ts = child_state->tile->lcu_offset_in_ts; end_in_ts = child_state->tile->lcu_offset_in_ts + child_state->tile->frame->width_in_lcu * child_state->tile->frame->height_in_lcu; break; @@ -441,8 +446,8 @@ if ((!slice_allowed || (range_end_slice < range_end_tile)) && !new_child && tile_allowed) { //Create a tile int tile_id = encoder->tiles_tile_id[range_start]; - int tile_x = tile_id % encoder->tiles_num_tile_columns; - int tile_y = tile_id / encoder->tiles_num_tile_columns; + int tile_x = tile_id % encoder->cfg.tiles_width_count; + int tile_y = tile_id / encoder->cfg.tiles_width_count; int lcu_offset_x = encoder->tiles_col_bd[tile_x]; int lcu_offset_y = encoder->tiles_row_bd[tile_y]; @@ -456,6 +461,9 @@ new_child->type = ENCODER_STATE_TYPE_TILE; new_child->frame = child_state->frame; new_child->tile = MALLOC(encoder_state_config_tile_t, 1); + if (child_state->encoder_control->cfg.crypto_features) { + new_child->tile->dbs_g = CreateC(); + } new_child->slice = child_state->slice; new_child->wfrow = child_state->wfrow; @@ -680,12 +688,10 @@ state->lcu_order_count = 0; if (!state->parent || 
(state->parent->wfrow != state->wfrow)) { - encoder_state_config_wfrow_finalize(state); FREE_POINTER(state->wfrow); } if (!state->parent || (state->parent->slice != state->slice)) { - encoder_state_config_slice_finalize(state); FREE_POINTER(state->slice); }
kvazaar-1.0.0.tar.gz/src/encoderstate.c -> kvazaar-1.1.0.tar.gz/src/encoderstate.c
Changed
@@ -196,23 +196,98 @@ } -static void encoder_state_worker_encode_lcu(void * opaque) { +/** + * \brief Sets the QP for each CU in state->tile->frame->cu_array. + * + * The QPs are used in deblocking. + * + * The delta QP for an LCU is coded when the first CU with coded block flag + * set is encountered. Hence, for the purposes of deblocking, all CUs + * before the first one with cbf set use state->ref_qp and all CUs after + * that use state->qp. + * + * \param state encoder state + * \param x x-coordinate of the left edge of the root CU + * \param y y-coordinate of the top edge of the root CU + * \param depth depth in the CU quadtree + * \param coeffs_coded Used for tracking whether a CU with a residual + * has been encountered. Should be set to false at + * the top level. + * \return Whether there were any CUs with residual or not. + */ +static bool set_cu_qps(encoder_state_t *state, int x, int y, int depth, bool coeffs_coded) +{ + if (state->qp == state->ref_qp) { + // If the QPs are equal there is no need to care about the residuals. + coeffs_coded = true; + } + + cu_info_t *cu = kvz_cu_array_at(state->tile->frame->cu_array, x, y); + const int cu_width = LCU_WIDTH >> depth; + coeffs_coded = coeffs_coded || cbf_is_set_any(cu->cbf, cu->depth); + + if (!coeffs_coded && cu->depth > depth) { + // Recursively process sub-CUs. + const int d = cu_width >> 1; + coeffs_coded = set_cu_qps(state, x, y, depth + 1, coeffs_coded); + coeffs_coded = set_cu_qps(state, x + d, y, depth + 1, coeffs_coded); + coeffs_coded = set_cu_qps(state, x, y + d, depth + 1, coeffs_coded); + coeffs_coded = set_cu_qps(state, x + d, y + d, depth + 1, coeffs_coded); + + } else { + if (!coeffs_coded && cu->tr_depth > depth) { + // The CU is split into smaller transform units. Check whether coded + // block flag is set for any of the TUs. 
+ const int tu_width = LCU_WIDTH >> cu->tr_depth; + for (int y_scu = y; y_scu < y + cu_width; y_scu += tu_width) { + for (int x_scu = x; x_scu < x + cu_width; x_scu += tu_width) { + cu_info_t *tu = kvz_cu_array_at(state->tile->frame->cu_array, x_scu, y_scu); + if (cbf_is_set_any(tu->cbf, cu->depth)) { + coeffs_coded = true; + } + } + } + } + + // Set the correct QP for all state->tile->frame->cu_array elements in + // the area covered by the CU. + const int8_t qp = coeffs_coded ? state->qp : state->ref_qp; + + for (int y_scu = y; y_scu < y + cu_width; y_scu += SCU_WIDTH) { + for (int x_scu = x; x_scu < x + cu_width; x_scu += SCU_WIDTH) { + kvz_cu_array_at(state->tile->frame->cu_array, x_scu, y_scu)->qp = qp; + } + } + } + + return coeffs_coded; +} + + +static void encoder_state_worker_encode_lcu(void * opaque) +{ const lcu_order_element_t * const lcu = opaque; encoder_state_t *state = lcu->encoder_state; const encoder_control_t * const encoder = state->encoder_control; videoframe_t* const frame = state->tile->frame; - + + kvz_set_lcu_lambda_and_qp(state, lcu->position); + //This part doesn't write to bitstream, it's only search, deblock and sao kvz_search_lcu(state, lcu->position_px.x, lcu->position_px.y, state->tile->hor_buf_search, state->tile->ver_buf_search); encoder_state_recdata_to_bufs(state, lcu, state->tile->hor_buf_search, state->tile->ver_buf_search); - if (encoder->deblock_enable) { + if (encoder->cfg.deblock_enable) { + if (encoder->cfg.target_bitrate > 0 || encoder->cfg.roi.dqps != NULL) { + set_cu_qps(state, lcu->position_px.x, lcu->position_px.y, 0, false); + } + kvz_filter_deblock_lcu(state, lcu->position_px.x, lcu->position_px.y); } - if (encoder->sao_enable) { + if (encoder->cfg.sao_enable) { kvz_sao_search_lcu(state, lcu->position.x, lcu->position.y); } @@ -239,28 +314,64 @@ } //Now write data to bitstream (required to have a correct CABAC state) - - //First LCU, and we are in a slice. 
We need a slice header - if (state->type == ENCODER_STATE_TYPE_SLICE && lcu->index == 0) { - kvz_encoder_state_write_bitstream_slice_header(state); - kvz_bitstream_add_rbsp_trailing_bits(&state->stream); - } + const uint64_t existing_bits = kvz_bitstream_tell(&state->stream); //Encode SAO - if (encoder->sao_enable) { + if (encoder->cfg.sao_enable) { encode_sao(state, lcu->position.x, lcu->position.y, &frame->sao_luma[lcu->position.y * frame->width_in_lcu + lcu->position.x], &frame->sao_chroma[lcu->position.y * frame->width_in_lcu + lcu->position.x]); } + + // QP delta is not used when rate control is turned off. + state->must_code_qp_delta = ( + state->encoder_control->cfg.target_bitrate > 0 + || state->encoder_control->cfg.roi.dqps != NULL); + //Encode coding tree kvz_encode_coding_tree(state, lcu->position.x << MAX_DEPTH, lcu->position.y << MAX_DEPTH, 0); - //Terminator - if (lcu->index < state->lcu_order_count - 1) { - //Since we don't handle slice segments, end of slice segment == end of slice - //Always 0 since otherwise it would be split - kvz_cabac_encode_bin_trm(&state->cabac, 0); // end_of_slice_segment_flag + bool end_of_slice_segment_flag; + if (state->encoder_control->cfg.slices & KVZ_SLICES_WPP) { + // Slice segments end after each WPP row. + end_of_slice_segment_flag = lcu->last_column; + } else if (state->encoder_control->cfg.slices & KVZ_SLICES_TILES) { + // Slices end after each tile. + end_of_slice_segment_flag = lcu->last_column && lcu->last_row; + } else { + // Slice ends after the last row of the last tile. 
+ int last_tile_id = -1 + encoder->cfg.tiles_width_count * encoder->cfg.tiles_height_count; + bool is_last_tile = state->tile->id == last_tile_id; + end_of_slice_segment_flag = is_last_tile && lcu->last_column && lcu->last_row; } - + kvz_cabac_encode_bin_trm(&state->cabac, end_of_slice_segment_flag); + + { + const bool end_of_tile = lcu->last_column && lcu->last_row; + const bool end_of_wpp_row = encoder->cfg.wpp && lcu->last_column; + + + if (end_of_tile || end_of_wpp_row) { + if (!end_of_slice_segment_flag) { + // end_of_sub_stream_one_bit + kvz_cabac_encode_bin_trm(&state->cabac, 1); + } + + // Finish the substream by writing out remaining state. + kvz_cabac_finish(&state->cabac); + + // Write a rbsp_trailing_bits or a byte_alignment. The first one is used + // for ending a slice_segment_layer_rbsp and the second one for ending + // a substream. They are identical and align the byte stream. + kvz_bitstream_put(state->cabac.stream, 1, 1); + kvz_bitstream_align_zero(state->cabac.stream); + + kvz_cabac_start(&state->cabac); + } + } + + const uint32_t bits = kvz_bitstream_tell(&state->stream) - existing_bits; + kvz_get_lcu_stats(state, lcu->position.x, lcu->position.y)->bits = bits; + //Wavefronts need the context to be copied to the next row if (state->type == ENCODER_STATE_TYPE_WAVEFRONT_ROW && lcu->index == 1) { int j; @@ -273,7 +384,7 @@ } } - if (encoder->sao_enable && lcu->above) { + if (encoder->cfg.sao_enable && lcu->above) { // Add the post-deblocking but pre-SAO pixels of the LCU row above this // row to a buffer so this row can use them on it's own SAO // reconstruction. @@ -296,8 +407,14 @@ assert(state->is_leaf); assert(state->lcu_order_count > 0); - const kvz_config *cfg = state->encoder_control->cfg; - + const kvz_config *cfg = &state->encoder_control->cfg; + if (cfg->crypto_features) { + InitC(state->tile->dbs_g); + state->tile->m_prev_pos = 0;
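The new `end_of_slice_segment_flag` logic in this file picks one of three rules depending on the `--slices` mode. The following sketch isolates that decision as a pure function. The enum `demo_slices`, the struct `demo_lcu_pos`, and the function name are hypothetical; the flag values are assumed for illustration and the real `KVZ_SLICES_*` constants should be taken from kvazaar.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values; the real KVZ_SLICES_* constants are in kvazaar.h. */
enum demo_slices {
  DEMO_SLICES_NONE  = 0,
  DEMO_SLICES_TILES = 1 << 0,
  DEMO_SLICES_WPP   = 1 << 1
};

/* Hypothetical summary of where an LCU sits inside its tile. */
typedef struct demo_lcu_pos {
  bool last_column;  /* LCU is in the last column of its tile */
  bool last_row;     /* LCU is in the last row of its tile */
  int tile_id;       /* index of the tile containing the LCU */
} demo_lcu_pos;

/* Decide whether a slice segment ends after this LCU. */
static bool demo_end_of_slice_segment(int slices, int tiles_wide,
                                      int tiles_high, demo_lcu_pos lcu)
{
  assert(tiles_wide > 0 && tiles_high > 0);

  if (slices & DEMO_SLICES_WPP) {
    /* One slice segment per WPP row. */
    return lcu.last_column;
  }
  if (slices & DEMO_SLICES_TILES) {
    /* One slice per tile. */
    return lcu.last_column && lcu.last_row;
  }
  /* Single slice: it ends with the last LCU of the last tile. */
  const int last_tile_id = tiles_wide * tiles_high - 1;
  return lcu.tile_id == last_tile_id && lcu.last_column && lcu.last_row;
}
```

Note that the WPP check comes first, as in the diff, so the row rule wins when both flags are set.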
kvazaar-1.0.0.tar.gz/src/encoderstate.h -> kvazaar-1.1.0.tar.gz/src/encoderstate.h
Changed
@@ -49,18 +49,48 @@ } encoder_state_type; +typedef struct lcu_stats_t { + //! \brief Number of bits that were spent + uint32_t bits; + + //! \brief Weight of the LCU for rate control + double weight; + + //! \brief Lambda value which was used for this LCU + double lambda; + + //! \brief Rate control alpha parameter + double rc_alpha; + + //! \brief Rate control beta parameter + double rc_beta; +} lcu_stats_t; + typedef struct encoder_state_config_frame_t { - double cur_lambda_cost; //!< \brief Lambda for SSE - double cur_lambda_cost_sqrt; //!< \brief Lambda for SAD and SATD - + /** + * \brief Frame-level lambda. + * + * Use state->lambda or state->lambda_sqrt for cost computations. + * + * \see encoder_state_t::lambda + * \see encoder_state_t::lambda_sqrt + */ + double lambda; + int32_t num; /*!< \brief Frame number */ int32_t poc; /*!< \brief Picture order count */ int8_t gop_offset; /*!< \brief Offset in the gop structure */ - - int8_t QP; //!< \brief Quantization parameter - double QP_factor; //!< \brief Quantization factor - + + /** + * \brief Frame-level quantization parameter + * + * \see encoder_state_t::qp + */ + int8_t QP; + //! \brief quantization factor + double QP_factor; + //Current picture available references image_list_t *ref; int8_t ref_list; @@ -84,10 +114,38 @@ //! Number of bits targeted for the current GOP. double cur_gop_target_bits; + //! Number of bits targeted for the current picture. + double cur_pic_target_bits; + // Parameters used in rate control double rc_alpha; double rc_beta; + /** + * \brief Indicates that this encoder state is ready for encoding the + * next frame i.e. kvz_encoder_prepare has been called. + */ + bool prepared; + + /** + * \brief Indicates that the previous frame has been encoded and the + * encoded data written and the encoding the next frame has not been + * started yet. + */ + bool done; + + /** + * \brief Information about the coded LCUs. + * + * Used for rate control. 
+ */ + lcu_stats_t *lcu_stats; + + /** + * \brief Whether next NAL is the first NAL in the access unit. + */ + bool first_nal; + } encoder_state_config_frame_t; typedef struct encoder_state_config_tile_t { @@ -185,21 +243,26 @@ bitstream_t stream; cabac_data_t cabac; + uint32_t stats_bitstream_length; //Bitstream length written in bytes + + //! \brief Lambda for SSE + double lambda; + //! \brief Lambda for SAD and SATD + double lambda_sqrt; + //! \brief Quantization parameter for the current LCU + int8_t qp; + /** - * \brief Indicates that this encoder state is ready for encoding the - * next frame i.e. kvz_encoder_prepare has been called. + * \brief Whether a QP delta value must be coded for the current LCU. */ - int prepared; + bool must_code_qp_delta; /** - * \brief Indicates that the previous frame has been encoded and the - * encoded data written and the encoding the next frame has not been - * started yet. + * \brief Reference for computing QP delta for the next LCU that is coded + * next. Updated whenever a QP delta is coded. */ - int frame_done; + int8_t ref_qp; - uint32_t stats_bitstream_length; //Bitstream length written in bytes - //Jobs to wait for threadqueue_job_t * tqj_recon_done; //Reconstruction is done threadqueue_job_t * tqj_bitstream_written; //Bitstream is written @@ -218,6 +281,21 @@ int ref_list_len_out[2], int ref_list_poc_out[2][16]); +lcu_stats_t* kvz_get_lcu_stats(encoder_state_t *state, int lcu_x, int lcu_y); + + +/** + * Whether the parameter sets should be written with the current frame. + */ +static INLINE bool encoder_state_must_write_vps(const encoder_state_t *state) +{ + const int32_t frame = state->frame->num; + const int32_t vps_period = state->encoder_control->cfg.vps_period; + + return (vps_period > 0 && frame % vps_period == 0) || + (vps_period >= 0 && frame == 0); +} + static const uint8_t g_group_idx[32] = { 0, 1, 2, 3, 4, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8,
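The `encoder_state_must_write_vps` predicate added above can be read as three cases: a negative period never writes parameter sets, a period of zero writes them only with the first frame, and a positive period writes them every `vps_period` frames. Below is a standalone sketch of the same expression, with the encoder state replaced by plain arguments. The name `demo_must_write_vps` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standalone version of the predicate from the diff above.
 *   vps_period <  0 : never write parameter sets
 *   vps_period == 0 : write them only with frame 0
 *   vps_period >  0 : write them every vps_period frames */
static bool demo_must_write_vps(int32_t frame, int32_t vps_period)
{
  assert(frame >= 0);  /* frame numbers are assumed non-negative here */

  return (vps_period > 0 && frame % vps_period == 0) ||
         (vps_period >= 0 && frame == 0);
}
```

The first operand guards the modulo, so `frame % vps_period` is never evaluated with a zero period.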
kvazaar-1.0.0.tar.gz/src/extras/crypto.cpp -> kvazaar-1.1.0.tar.gz/src/extras/crypto.cpp
Changed
@@ -19,9 +19,12 @@ } AESDecoder; -AESDecoder* Init() { +AESDecoder* Create() { + AESDecoder * AESdecoder = (AESDecoder *)malloc(sizeof(AESDecoder)); + return AESdecoder; +} +void Init(AESDecoder* AESdecoder) { int init_val[32] = {201, 75, 219, 152, 6, 245, 237, 107, 179, 194, 81, 29, 66, 98, 198, 0, 16, 213, 27, 56, 255, 127, 242, 112, 97, 126, 197, 204, 25, 59, 38, 30}; - AESDecoder * AESdecoder = (AESDecoder *)malloc(sizeof(AESDecoder)); for(int i=0;i<16; i++) { AESdecoder->iv [i] = init_val[i]; AESdecoder->counter[i] = init_val[5+i]; @@ -35,7 +38,6 @@ AESdecoder->couter_avail = 0; AESdecoder->counter_index = 0; AESdecoder->counter_index_pos = 0; - return AESdecoder; } void DeleteCrypto(AESDecoder * AESdecoder) { @@ -105,11 +107,15 @@ return key_; } #endif +Crypto_Handle CreateC() { + AESDecoder* AESdecoder = Create(); + return AESdecoder; +} -Crypto_Handle InitC(){ - AESDecoder* AESdecoder = Init(); - return AESdecoder; +void InitC(Crypto_Handle hdl) { + Init((AESDecoder*)hdl); } + #if AESEncryptionStreamMode unsigned int ff_get_key (Crypto_Handle *hdl, int nb_bits) { return get_key ((AESDecoder*)*hdl, nb_bits);
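The change above splits the old `Init()` into an allocating `Create()` and a separate `Init()` that resets an existing object. Judging from the other hunks in this revision, that appears to let a handle be allocated once per tile and reset before each use. The sketch below shows the general two-phase pattern in plain C. The type `demo_cipher` and its functions are hypothetical and do not reproduce the real `AESDecoder` state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cipher state; much smaller than the real AESDecoder. */
typedef struct demo_cipher {
  unsigned char counter[16];
  int counter_index;
} demo_cipher;

/* Phase 1: allocate only. The contents are uninitialized until init runs. */
static demo_cipher *demo_cipher_create(void)
{
  return malloc(sizeof(demo_cipher));
}

/* Phase 2: reset to a known starting state. Safe to call repeatedly. */
static void demo_cipher_init(demo_cipher *c)
{
  assert(c != NULL);
  memset(c->counter, 0, sizeof(c->counter));
  c->counter_index = 0;
}

/* Release the handle. Passing NULL is allowed because free(NULL) is a no-op. */
static void demo_cipher_delete(demo_cipher *c)
{
  free(c);
}
```

Separating the two phases avoids a malloc and free pair on every reset, at the cost of requiring callers to run init before first use.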
kvazaar-1.0.0.tar.gz/src/extras/crypto.h -> kvazaar-1.1.0.tar.gz/src/extras/crypto.h
Changed
@@ -16,8 +16,8 @@
 extern "C" {
 #endif
   typedef void* Crypto_Handle;
-
-  STUBBED Crypto_Handle InitC();
+  STUBBED Crypto_Handle CreateC();
+  STUBBED void InitC(Crypto_Handle hdl);
   STUBBED void DecryptC(Crypto_Handle hdl, const unsigned char *in_stream, int size_bits, unsigned char *out_stream);
 #if AESEncryptionStreamMode
   STUBBED unsigned int ff_get_key(Crypto_Handle *hdl, int nb_bits);
@@ -36,27 +36,36 @@
 // Provide them in the header so we can avoid compiling the cpp file, which
 // means we don't need a C++ compiler when crypto is not enabled.
 
-#include <assert.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <inttypes.h>
 
-static INLINE Crypto_Handle InitC()
-{
-  // Stub.
-  assert(0);
-  return 0;
+static uintptr_t handle_id = 1;
+
+static INLINE Crypto_Handle CreateC() {
+  printf("Crypto CreateC %" PRIuPTR "\n", handle_id);
+  return (void*)(handle_id++);
+}
+static INLINE void InitC(Crypto_Handle hdl) {
+  printf("Crypto InitC %" PRIuPTR "\n", (uintptr_t)hdl);
 }
 
 static INLINE void DecryptC(Crypto_Handle hdl, const unsigned char *in_stream, int size_bits, unsigned char *out_stream)
 {
   // Stub.
-  assert(0);
+  printf("Crypto DecryptC %" PRIuPTR "\n", (uintptr_t)hdl);
 }
 
 #if AESEncryptionStreamMode
 static INLINE unsigned int ff_get_key(Crypto_Handle *hdl, int nb_bits)
 {
   // Stub.
-  assert(0);
+  static Crypto_Handle ff_get_key_last_hdl = 0;
+  if (*hdl != ff_get_key_last_hdl) {
+    printf("Crypto ff_get_key %" PRIuPTR "\n", (uintptr_t)*hdl);
+  }
+  ff_get_key_last_hdl = *hdl;
   return 0;
 }
 #endif
@@ -64,7 +73,7 @@
 static INLINE void DeleteCryptoC(Crypto_Handle hdl)
 {
   // Stub.
-  assert(0);
+  printf("Crypto DeleteCryptoC %" PRIuPTR "\n", (uintptr_t)hdl);
 }
 
 #endif // KVZ_SEL_ENCRYPTION
kvazaar-1.0.0.tar.gz/src/filter.c -> kvazaar-1.1.0.tar.gz/src/filter.c
Changed
@@ -75,67 +75,80 @@ // FUNCTIONS /** - * \brief + * \brief Perform in strong luma filtering in place. + * \param line line of 8 pixels, with center at index 4 + * \param tc tc treshold + * \return Reach of the filter starting from center. */ -static INLINE void kvz_filter_deblock_luma(const encoder_control_t * const encoder, - kvz_pixel *src, - int32_t offset, - int32_t tc, - int8_t sw, - int8_t part_P_nofilter, - int8_t part_Q_nofilter, - int32_t thr_cut, - int8_t filter_second_P, - int8_t filter_second_Q) +static INLINE int kvz_filter_deblock_luma_strong( + kvz_pixel *line, + int32_t tc) { - int32_t delta; + const kvz_pixel m0 = line[0]; + const kvz_pixel m1 = line[1]; + const kvz_pixel m2 = line[2]; + const kvz_pixel m3 = line[3]; + const kvz_pixel m4 = line[4]; + const kvz_pixel m5 = line[5]; + const kvz_pixel m6 = line[6]; + const kvz_pixel m7 = line[7]; + + line[1] = CLIP(m1 - 2*tc, m1 + 2*tc, (2*m0 + 3*m1 + m2 + m3 + m4 + 4) >> 3); + line[2] = CLIP(m2 - 2*tc, m2 + 2*tc, ( m1 + m2 + m3 + m4 + 2) >> 2); + line[3] = CLIP(m3 - 2*tc, m3 + 2*tc, ( m1 + 2*m2 + 2*m3 + 2*m4 + m5 + 4) >> 3); + line[4] = CLIP(m4 - 2*tc, m4 + 2*tc, ( m2 + 2*m3 + 2*m4 + 2*m5 + m6 + 4) >> 3); + line[5] = CLIP(m5 - 2*tc, m5 + 2*tc, ( m3 + m4 + m5 + m6 + 2) >> 2); + line[6] = CLIP(m6 - 2*tc, m6 + 2*tc, ( m3 + m4 + m5 + 3*m6 + 2*m7 + 4) >> 3); + + return 3; +} - int16_t m0 = src[-offset * 4]; - int16_t m1 = src[-offset * 3]; - int16_t m2 = src[-offset * 2]; - int16_t m3 = src[-offset]; - int16_t m4 = src[0]; - int16_t m5 = src[offset]; - int16_t m6 = src[offset * 2]; - int16_t m7 = src[offset * 3]; - - if (sw) { - src[-offset * 3] = CLIP(m1 - 2*tc, m1 + 2*tc, (2*m0 + 3*m1 + m2 + m3 + m4 + 4) >> 3); - src[-offset * 2] = CLIP(m2 - 2*tc, m2 + 2*tc, ( m1 + m2 + m3 + m4 + 2) >> 2); - src[-offset] = CLIP(m3 - 2*tc, m3 + 2*tc, ( m1 + 2*m2 + 2*m3 + 2*m4 + m5 + 4) >> 3); - src[0] = CLIP(m4 - 2*tc, m4 + 2*tc, ( m2 + 2*m3 + 2*m4 + 2*m5 + m6 + 4) >> 3); - src[offset] = CLIP(m5 - 2*tc, m5 + 2*tc, ( m3 + 
m4 + m5 + m6 + 2) >> 2); - src[offset * 2] = CLIP(m6 - 2*tc, m6 + 2*tc, ( m3 + m4 + m5 + 3*m6 + 2*m7 + 4) >> 3); +/** + * \brief Perform in weak luma filtering in place. + * \param encoder Encoder + * \param line Line of 8 pixels, with center at index 4 + * \param tc The tc treshold + * \param p_2nd Whether to filter the 2nd line of P + * \param q_2nd Whether to filter the 2nd line of Q + */ +static INLINE int kvz_filter_deblock_luma_weak( + const encoder_control_t * const encoder, + kvz_pixel *line, + int32_t tc, + bool p_2nd, + bool q_2nd) +{ + const kvz_pixel m1 = line[1]; + const kvz_pixel m2 = line[2]; + const kvz_pixel m3 = line[3]; + const kvz_pixel m4 = line[4]; + const kvz_pixel m5 = line[5]; + const kvz_pixel m6 = line[6]; + + int32_t delta = (9 * (m4 - m3) - 3 * (m5 - m2) + 8) >> 4; + + if (abs(delta) >= tc * 10) { + return 0; } else { - // Weak filter - delta = (9*(m4 - m3) - 3*(m5 - m2) + 8) >> 4; - - if (abs(delta) < thr_cut) { - int32_t tc2 = tc >> 1; - delta = CLIP(-tc, tc, delta); - src[-offset] = CLIP(0, (1 << encoder->bitdepth) - 1, (m3 + delta)); - src[0] = CLIP(0, (1 << encoder->bitdepth) - 1, (m4 - delta)); - - if(filter_second_P) { - int32_t delta1 = CLIP(-tc2, tc2, (((m1 + m3 + 1) >> 1) - m2 + delta) >> 1); - src[-offset * 2] = CLIP(0, (1 << encoder->bitdepth) - 1, m2 + delta1); - } - if(filter_second_Q) { - int32_t delta2 = CLIP(-tc2, tc2, (((m6 + m4 + 1) >> 1) - m5 - delta) >> 1); - src[offset] = CLIP(0, (1 << encoder->bitdepth) - 1, m5 + delta2); - } + int32_t tc2 = tc >> 1; + delta = CLIP(-tc, tc, delta); + line[3] = CLIP(0, (1 << encoder->bitdepth) - 1, (m3 + delta)); + line[4] = CLIP(0, (1 << encoder->bitdepth) - 1, (m4 - delta)); + + if (p_2nd) { + int32_t delta1 = CLIP(-tc2, tc2, (((m1 + m3 + 1) >> 1) - m2 + delta) >> 1); + line[2] = CLIP(0, (1 << encoder->bitdepth) - 1, m2 + delta1); + } + if (q_2nd) { + int32_t delta2 = CLIP(-tc2, tc2, (((m6 + m4 + 1) >> 1) - m5 - delta) >> 1); + line[5] = CLIP(0, (1 << encoder->bitdepth) - 1, m5 + 
delta2); + } + + if (p_2nd || q_2nd) { + return 2; + } else { + return 1; } - } - - if(part_P_nofilter) { - src[-offset] = (kvz_pixel)m3; - src[-offset * 2] = (kvz_pixel)m2; - src[-offset * 3] = (kvz_pixel)m1; - } - if(part_Q_nofilter) { - src[0] = (kvz_pixel)m4; - src[offset] = (kvz_pixel)m5; - src[offset * 2] = (kvz_pixel)m6; } } @@ -247,6 +260,59 @@ } } +static int8_t get_qp_y_pred(const encoder_state_t* state, int x, int y, edge_dir dir) +{ + if (state->encoder_control->cfg.target_bitrate <= 0 + && state->encoder_control->cfg.roi.dqps == NULL) + { + return state->qp; + } + + int32_t qp_p; + if (dir == EDGE_HOR && y > 0) { + qp_p = kvz_cu_array_at_const(state->tile->frame->cu_array, x, y - 1)->qp; + } else if (dir == EDGE_VER && x > 0) { + qp_p = kvz_cu_array_at_const(state->tile->frame->cu_array, x - 1, y)->qp; + } else { + qp_p = state->frame->QP; + } + + const int32_t qp_q = + kvz_cu_array_at_const(state->tile->frame->cu_array, x, y)->qp; + + return (qp_p + qp_q + 1) >> 1; +} + +/** + * \brief Gather pixels needed for deblocking + */ +static INLINE void gather_deblock_pixels( + const kvz_pixel *src, + int step, + int stride, + int reach, + kvz_pixel *dst) +{ + for (int i = -reach; i < +reach; ++i) { + dst[i + 4] = src[i * step + stride]; + } +} + +/** +* \brief Scatter pixels +*/ +static INLINE void scatter_deblock_pixels( + const kvz_pixel *src, + int step, + int stride, + int reach, + kvz_pixel *dst) +{ + for (int i = -reach; i < +reach; ++i) { + dst[i * step + stride] = src[i + 4]; + } +} + /** * \brief Apply the deblocking filter to luma pixels on a single edge. * @@ -284,33 +350,33 @@ {
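The new `gather_deblock_pixels` and `scatter_deblock_pixels` helpers copy the samples that straddle an edge into a contiguous 8-entry line (centre at index 4) and back. The sketch below reproduces that index arithmetic on a small test image. The names `gather_line` and `scatter_line`, the 8-bit `pixel_t`, and the renaming of the second integer parameter to `offset` are assumptions made for readability; the loop bodies follow the hunk.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t pixel_t; /* assumption: 8-bit samples */

/* Copy 2*reach samples that cross an edge into dst[4-reach .. 4+reach-1].
 * src points at the first sample on the Q side of the edge, step is the
 * distance between samples across the edge, and offset selects the
 * position along the edge. */
static void gather_line(const pixel_t *src, int step, int offset,
                        int reach, pixel_t *dst)
{
  for (int i = -reach; i < reach; ++i) {
    dst[i + 4] = src[i * step + offset];
  }
}

/* Inverse of gather_line: write the contiguous line back to the image. */
static void scatter_line(const pixel_t *src, int step, int offset,
                         int reach, pixel_t *dst)
{
  for (int i = -reach; i < reach; ++i) {
    dst[i * step + offset] = src[i + 4];
  }
}
```

For a vertical edge the step is 1 and the offset is a multiple of the image stride; for a horizontal edge the roles swap. Filtering on the gathered line lets one filter routine serve both directions.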
kvazaar-1.0.0.tar.gz/src/global.h -> kvazaar-1.1.0.tar.gz/src/global.h
Changed
@@ -151,13 +151,13 @@
 #define LCU_CHROMA_SIZE (LCU_WIDTH * LCU_WIDTH >> 2)
 
 #define MAX_REF_PIC_COUNT 16
-#define DEFAULT_REF_PIC_COUNT 3
 
 #define AMVP_MAX_NUM_CANDS 2
 #define AMVP_MAX_NUM_CANDS_MEM 3
 #define MRG_MAX_NUM_CANDS 5
 
 /* Some tools */
+#define ABS(a) ((a) >= 0 ? (a) : (-a))
 #define MAX(a,b) (((a)>(b))?(a):(b))
 #define MIN(a,b) (((a)<(b))?(a):(b))
 #define CLIP(low,high,value) MAX((low),MIN((high),(value)))
@@ -181,7 +181,7 @@
 // NOTE: When making a release, check to see if incrementing libversion in
 // configure.ac is necessary.
 #ifndef KVZ_VERSION
-#define KVZ_VERSION 0.8.3
+#define KVZ_VERSION 1.1.0
 #endif
 #define VERSION_STRING QUOTE_EXPAND(KVZ_VERSION)
 
@@ -246,14 +246,12 @@
 
 #define MAX_TR_DYNAMIC_RANGE 15
 
-#define EXP_GOLOMB_TABLE_SIZE (4096*8)
-
 //Constants
 typedef enum { COLOR_Y = 0, COLOR_U, COLOR_V } color_t;
 
 
 // Hardware data (abstraction of defines). Extend for other compilers
-#if defined(_M_IX86) || defined(__i586__) || defined(__i686__) || defined(_M_X64) || defined(_M_AMD64) || defined(__amd64__) || defined(__x86_64__)
+#if defined(_M_IX86) || defined(__i386__) || defined(__i486__) || defined(__i586__) || defined(__i686__) || defined(_M_X64) || defined(_M_AMD64) || defined(__amd64__) || defined(__x86_64__)
 # define COMPILE_INTEL 1
 #else
 # define COMPILE_INTEL 0
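One detail of the `ABS` macro added in this hunk deserves attention: the negated branch is written `(-a)` rather than `(-(a))`, so an argument that is itself an expression is negated only in its first term. The sketch below contrasts the macro as written with a fully parenthesised variant. `ABS_AS_WRITTEN`, `ABS_SAFE`, and the two wrapper functions are names introduced here for illustration; the second macro is a suggestion and not part of the diff.

```c
#include <assert.h>

/* Same expansion as the ABS macro in the hunk. */
#define ABS_AS_WRITTEN(a) ((a) >= 0 ? (a) : (-a))

/* Fully parenthesised variant (a suggestion, not part of the diff). */
#define ABS_SAFE(a) ((a) >= 0 ? (a) : (-(a)))

/* With a = x - y the first macro expands its negative branch to (-x - y). */
static int abs_diff_as_written(int x, int y) { return ABS_AS_WRITTEN(x - y); }
static int abs_diff_safe(int x, int y)       { return ABS_SAFE(x - y); }
```

Both macros agree for simple arguments such as a single variable; the difference only shows when the argument contains a lower-precedence operator.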
kvazaar-1.0.0.tar.gz/src/image.c -> kvazaar-1.1.0.tar.gz/src/image.c
Changed
@@ -478,24 +478,6 @@
 }
 
 
-unsigned kvz_pixels_calc_ssd(const kvz_pixel *const ref, const kvz_pixel *const rec,
-                 const int ref_stride, const int rec_stride,
-                 const int width)
-{
-  int ssd = 0;
-  int y, x;
-
-  for (y = 0; y < width; ++y) {
-    for (x = 0; x < width; ++x) {
-      int diff = ref[x + y * ref_stride] - rec[x + y * rec_stride];
-      ssd += diff * diff;
-    }
-  }
-
-  return ssd;
-}
-
-
 /**
  * \brief BLock Image Transfer from one buffer to another.
  *
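This hunk only shows the scalar `kvz_pixels_calc_ssd` being removed from `image.c`; where its replacement lives is not visible here. For reference, a scalar sum of squared differences over a square block can be sketched as below. The name `ssd_block`, the 8-bit `pixel_t`, and the 64-bit accumulator are assumptions (the removed code accumulated into an `int`).

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t pixel_t; /* assumption: 8-bit samples */

/* Sum of squared differences over a width x width block.
 * Each buffer has its own stride, so the blocks may live inside
 * larger images. */
static uint64_t ssd_block(const pixel_t *ref, const pixel_t *rec,
                          int ref_stride, int rec_stride, int width)
{
  uint64_t ssd = 0;
  for (int y = 0; y < width; ++y) {
    for (int x = 0; x < width; ++x) {
      const int diff = ref[x + y * ref_stride] - rec[x + y * rec_stride];
      ssd += (uint64_t)(diff * diff);
    }
  }
  return ssd;
}
```

A scalar version like this is also useful as a correctness check for any vectorised implementation.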
kvazaar-1.0.0.tar.gz/src/image.h -> kvazaar-1.1.0.tar.gz/src/image.h
Changed
@@ -78,11 +78,6 @@
                                     int block_width, int block_height,
                                     int max_lcu_below);
 
-unsigned kvz_pixels_calc_ssd(const kvz_pixel *const ref, const kvz_pixel *const rec,
-                 const int ref_stride, const int rec_stride,
-                 const int width);
-
-
 void kvz_pixels_blit(const kvz_pixel* orig, kvz_pixel *dst,
                          unsigned width, unsigned height,
                          unsigned orig_stride, unsigned dst_stride);
kvazaar-1.0.0.tar.gz/src/imagelist.c -> kvazaar-1.1.0.tar.gz/src/imagelist.c
Changed
@@ -114,7 +114,7 @@
     unsigned new_size = MAX(list->size + 1, list->size * 2);
     if (!kvz_image_list_resize(list, new_size)) return 0;
   }
-
+
   for (i = list->used_size; i > 0; i--) {
     list->images[i] = list->images[i - 1];
     list->cu_arrays[i] = list->cu_arrays[i - 1];
kvazaar-1.0.0.tar.gz/src/input_frame_buffer.c -> kvazaar-1.1.0.tar.gz/src/input_frame_buffer.c
Changed
@@ -54,7 +54,7 @@
                                      kvz_picture *const img_in)
 {
   const encoder_control_t* const encoder = state->encoder_control;
-  const kvz_config* const cfg = encoder->cfg;
+  const kvz_config* const cfg = &encoder->cfg;
 
   const int gop_buf_size = 3 * cfg->gop_len;
 
@@ -67,7 +67,8 @@
     img_in->dts = img_in->pts;
 
     state->frame->gop_offset = 0;
-    if (cfg->gop_lowdelay) {
+    if (cfg->gop_len > 0) {
+      // Using a low delay GOP structure.
       state->frame->gop_offset = (state->frame->num - 1) % cfg->gop_len;
       if (state->frame->gop_offset < 0) {
         // Set gop_offset of IDR as the highest quality picture.
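The `gop_offset < 0` check in this hunk exists because C's `%` operator truncates toward zero: for frame 0 the expression `(num - 1) % gop_len` yields -1, not `gop_len - 1`. The sketch below demonstrates that behaviour. Both function names are hypothetical, and the wrap to the last GOP slot in `gop_offset_wrapped` is an assumption, since the hunk is cut off before the body of the `if`.

```c
#include <assert.h>

/* The expression from the hunk, without any correction. */
static int gop_offset_raw(int frame_num, int gop_len)
{
  assert(gop_len > 0);
  return (frame_num - 1) % gop_len;
}

/* Assumed handling of the negative case: map it to the last GOP slot.
 * The actual body of the if-branch is not shown in the diff. */
static int gop_offset_wrapped(int frame_num, int gop_len)
{
  const int offset = gop_offset_raw(frame_num, gop_len);
  return offset < 0 ? gop_len - 1 : offset;
}
```

With a GOP length of 8, frame 0 gives a raw offset of -1, while frames 1 and 9 both give 0.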
kvazaar-1.0.0.tar.gz/src/inter.c -> kvazaar-1.1.0.tar.gz/src/inter.c
Changed
@@ -325,7 +325,7 @@ // Generate prediction for luma. if (fractional_luma) { // With a fractional MV, do interpolation. - if (state->encoder_control->cfg->bipred && hi_prec_out) { + if (state->encoder_control->cfg.bipred && hi_prec_out) { inter_recon_14bit_frac_luma(state, ref, pu_in_tile.x, pu_in_tile.y, width, height, @@ -361,7 +361,7 @@ // Generate prediction for chroma. if (fractional_luma || fractional_chroma) { // With a fractional MV, do interpolation. - if (state->encoder_control->cfg->bipred && hi_prec_out) { + if (state->encoder_control->cfg.bipred && hi_prec_out) { inter_recon_14bit_frac_chroma(state, ref, pu_in_tile.x, pu_in_tile.y, width, height, @@ -657,7 +657,9 @@ int32_t width, int32_t height, cu_info_t **C3, - cu_info_t **H) { + cu_info_t **H, + uint8_t ref_list, + uint8_t ref_idx) { /* Predictor block locations _________ @@ -671,22 +673,21 @@ *C3 = NULL; *H = NULL; - // Find temporal reference, closest POC + // Find temporal reference if (state->frame->ref->used_size) { - uint32_t poc_diff = UINT_MAX; - int32_t closest_ref = 0; + uint32_t colocated_ref = UINT_MAX; + // Select L0/L1 ref_idx reference for (int temporal_cand = 0; temporal_cand < state->frame->ref->used_size; temporal_cand++) { - int td = state->frame->poc - state->frame->ref->pocs[temporal_cand]; - - td = td < 0 ? -td : td; - if (td < poc_diff) { - closest_ref = temporal_cand; - poc_diff = td; + if (state->frame->refmap[temporal_cand].list == ref_list && state->frame->refmap[temporal_cand].idx == ref_idx) { + colocated_ref = temporal_cand; + break; } } + + if (colocated_ref == UINT_MAX) return; - cu_array_t *ref_cu_array = state->frame->ref->cu_arrays[closest_ref]; + cu_array_t *ref_cu_array = state->frame->ref->cu_arrays[colocated_ref]; int cu_per_width = ref_cu_array->width / SCU_WIDTH; uint32_t xColBr = x + width; @@ -916,9 +917,9 @@ } /** - * \brief Pick two mv candidates from the spatial candidates. + * \brief Pick two mv candidates from the spatial and temporal candidates. 
*/ -static void get_mv_cand_from_spatial(const encoder_state_t * const state, +static void get_mv_cand_from_candidates(const encoder_state_t * const state, int32_t x, int32_t y, int32_t width, @@ -1081,44 +1082,71 @@ } // Remove identical candidate - if(candidates == 2 && mv_cand[0][0] == mv_cand[1][0] && mv_cand[0][1] == mv_cand[1][1]) { + if (candidates == 2 && mv_cand[0][0] == mv_cand[1][0] && mv_cand[0][1] == mv_cand[1][1]) { candidates = 1; } - if (state->encoder_control->cfg->tmvp_enable) { + // Use Temporal Motion Vector Prediction when enabled + if (state->encoder_control->cfg.tmvp_enable) { /* Predictor block locations - _________ + __________ |CurrentPU| | |C0|__ | | |C3| | |_________|_ - |H| + |H| */ - // Find temporal reference, closest POC + // TMVP required at least two sequential P/B-frames if (state->frame->poc > 1 && state->frame->ref->used_size && candidates < AMVP_MAX_NUM_CANDS) { - uint32_t poc_diff = UINT_MAX; - - for (int temporal_cand = 0; temporal_cand < state->frame->ref->used_size; temporal_cand++) { - int td = state->frame->poc - state->frame->ref->pocs[temporal_cand]; - td = td < 0 ? -td : td; - if (td < poc_diff) { - poc_diff = td; - } - } + // Use "H" as the primary predictor and "C3" as secondary const cu_info_t *selected_CU = (h != NULL) ? h : (c3 != NULL) ? 
c3 : NULL; if (selected_CU) { - int td = selected_CU->inter.mv_ref[reflist] + 1; - int tb = cur_cu->inter.mv_ref[reflist] + 1; + uint32_t colocated_ref = UINT_MAX; + uint32_t colocated_ref_poc = 0; + int td, tb; + + //ToDo: allow other than L0[0] for prediction + + //Fetch ref idx of the selected CU in L0[0] ref list + for (int temporal_cand = 0; temporal_cand < state->frame->ref->used_size; temporal_cand++) { + if (state->frame->refmap[temporal_cand].list == 1 && state->frame->refmap[temporal_cand].idx == 0) { + colocated_ref = temporal_cand; + break; + } + } + + if (colocated_ref != UINT_MAX) { + + uint8_t used_reflist = reflist; + + colocated_ref_poc = state->frame->ref->pocs[colocated_ref]; + + if (!(selected_CU->inter.mv_dir & (1 << used_reflist))) { + used_reflist = !reflist; + } - int scale = CALCULATE_SCALE(NULL, tb, td); - mv_cand[candidates][0] = ((scale * selected_CU->inter.mv[0][0] + 127 + (scale * selected_CU->inter.mv[0][0] < 0)) >> 8); - mv_cand[candidates][1] = ((scale * selected_CU->inter.mv[0][1] + 127 + (scale * selected_CU->inter.mv[0][1] < 0)) >> 8); + // The reference id the colocated block is using + uint32_t colocated_ref_mv_ref = selected_CU->inter.mv_ref[used_reflist]; - candidates++; + td = colocated_ref_poc - state->frame->ref->images[colocated_ref]->ref_pocs[colocated_ref_mv_ref]; + tb = state->frame->poc - state->frame->ref->pocs[cur_cu->inter.mv_ref[reflist]]; + + if (td == tb) { + mv_cand[candidates][0] = selected_CU->inter.mv[used_reflist][0]; + mv_cand[candidates][1] = selected_CU->inter.mv[used_reflist][1]; + } else { + int scale = CALCULATE_SCALE(NULL, tb, td); + mv_cand[candidates][0] = ((scale * selected_CU->inter.mv[used_reflist][0] + 127 + ((scale * selected_CU->inter.mv[used_reflist][0]) < 0)) >> 8); + mv_cand[candidates][1] = ((scale * selected_CU->inter.mv[used_reflist][1] + 127 + ((scale * selected_CU->inter.mv[used_reflist][1]) < 0)) >> 8); + } + + candidates++; + + } } #undef CALCULATE_SCALE } @@ -1162,8 +1190,8 @@ 
get_spatial_merge_candidates(x, y, width, height, state->tile->frame->width, state->tile->frame->height, &b0, &b1, &b2, &a0, &a1, lcu); - kvz_inter_get_temporal_merge_candidates(state, x, y, width, height, &c3, &h); - get_mv_cand_from_spatial(state, x, y, width, height, b0, b1, b2, a0, a1, c3, h, cur_cu, reflist, mv_cand); + kvz_inter_get_temporal_merge_candidates(state, x, y, width, height, &c3, &h, 1, 0); + get_mv_cand_from_candidates(state, x, y, width, height, b0, b1, b2, a0, a1, c3, h, cur_cu, reflist, mv_cand); } /** @@ -1196,8 +1224,8 @@ x, y, width, height, state->tile->frame->width, state->tile->frame->height, &b0, &b1, &b2, &a0, &a1); - kvz_inter_get_temporal_merge_candidates(state, x, y, width, height, &c3, &h); - get_mv_cand_from_spatial(state, x, y, width, height, b0, b1, b2, a0, a1, c3, h, cur_cu, reflist, mv_cand); + kvz_inter_get_temporal_merge_candidates(state, x, y, width, height, &c3, &h, 1, 0); + get_mv_cand_from_candidates(state, x, y, width, height, b0, b1, b2, a0, a1, c3, h, cur_cu, reflist, mv_cand); } /** @@ -1211,6 +1239,7 @@ * \param use_b1 true, if candidate b1 can be used * \param mv_cand Returns the merge candidates. * \param lcu lcu containing the block + * \param ref_idx current reference index (used only by TMVP) * \return number of merge candidates */ uint8_t kvz_inter_get_merge_cand(const encoder_state_t * const state, @@ -1218,7 +1247,8 @@ int32_t width, int32_t height, bool use_a1, bool use_b1, inter_merge_cand_t mv_cand[MRG_MAX_NUM_CANDS], - lcu_t *lcu)
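The reworked TMVP code in this file scales the co-located motion vector by the ratio of two POC distances, `tb` (current picture to its reference) and `td` (co-located picture to its reference), and skips the scaling when they are equal. The `CALCULATE_SCALE` macro itself is not shown in the diff, so the sketch below assumes it follows the temporal scaling defined in the HEVC specification; `clip3`, `mv_scale_factor`, and `scale_mv` are names introduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static int clip3(int lo, int hi, int v)
{
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Distance scale factor in units of 1/256, following the HEVC temporal
 * MV scaling (assumed to match CALCULATE_SCALE). td must be non-zero.
 * Note: for negative products the right shift relies on arithmetic
 * shifting, which is implementation-defined in C. */
static int mv_scale_factor(int tb, int td)
{
  assert(td != 0);
  tb = clip3(-128, 127, tb);
  td = clip3(-128, 127, td);
  const int tx = (16384 + (abs(td) >> 1)) / td;
  return clip3(-4096, 4095, (tb * tx + 32) >> 6);
}

/* Apply the scale factor to one MV component, rounding the magnitude. */
static int16_t scale_mv(int scale, int16_t mv)
{
  const int prod = scale * mv;
  const int mag = (abs(prod) + 127) >> 8;
  return (int16_t)clip3(-32768, 32767, prod < 0 ? -mag : mag);
}
```

Equal distances give a factor of 256, which leaves the vector unchanged and matches the `td == tb` shortcut in the hunk. Rounding the magnitude and restoring the sign yields the same result as the diff's `(x + 127 + (x < 0)) >> 8` form on platforms where `>>` shifts arithmetically.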
kvazaar-1.0.0.tar.gz/src/inter.h -> kvazaar-1.1.0.tar.gz/src/inter.h
Changed
@@ -85,5 +85,6 @@
                                  int32_t width, int32_t height,
                                  bool use_a1, bool use_b1,
                                  inter_merge_cand_t mv_cand[MRG_MAX_NUM_CANDS],
-                                 lcu_t *lcu);
+                                 lcu_t *lcu,
+                                 uint8_t ref_idx);
 #endif
kvazaar-1.0.0.tar.gz/src/intra.c -> kvazaar-1.1.0.tar.gz/src/intra.c
Changed
@@ -29,6 +29,45 @@ #include "transform.h" #include "videoframe.h" +// Tables for looking up the number of intra reference pixels based on +// prediction units coordinate within an LCU. +// generated by "tools/generate_ref_pixel_tables.py". +static const uint8_t num_ref_pixels_top[16][16] = { + { 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 32, 28, 24, 20, 16, 12, 8, 4, 32, 28, 24, 20, 16, 12, 8, 4 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 64, 60, 56, 52, 48, 44, 40, 36, 32, 28, 24, 20, 16, 12, 8, 4 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 32, 28, 24, 20, 16, 12, 8, 4, 32, 28, 24, 20, 16, 12, 8, 4 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 } +}; +static const uint8_t num_ref_pixels_left[16][16] = { + { 64, 4, 8, 4, 16, 4, 8, 4, 32, 4, 8, 4, 16, 4, 8, 4 }, + { 60, 4, 4, 4, 12, 4, 4, 4, 28, 4, 4, 4, 12, 4, 4, 4 }, + { 56, 4, 8, 4, 8, 4, 8, 4, 24, 4, 8, 4, 8, 4, 8, 4 }, + { 52, 4, 4, 4, 4, 4, 4, 4, 20, 4, 4, 4, 4, 4, 4, 4 }, + { 48, 4, 8, 4, 16, 4, 8, 4, 16, 4, 8, 4, 16, 4, 8, 4 }, + { 44, 4, 4, 4, 12, 4, 4, 4, 12, 4, 4, 4, 12, 4, 4, 4 }, + { 40, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 36, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4 }, + { 32, 4, 8, 4, 16, 4, 8, 4, 32, 4, 8, 4, 16, 4, 8, 4 }, + { 28, 4, 4, 4, 12, 4, 4, 4, 28, 4, 4, 4, 12, 4, 4, 4 }, + { 24, 4, 8, 4, 8, 4, 8, 4, 24, 4, 8, 4, 8, 4, 8, 4 }, + { 20, 4, 4, 4, 4, 4, 4, 4, 20, 4, 4, 4, 4, 4, 4, 4 }, + { 16, 4, 8, 4, 
16, 4, 8, 4, 16, 4, 8, 4, 16, 4, 8, 4 }, + { 12, 4, 4, 4, 12, 4, 4, 4, 12, 4, 4, 4, 12, 4, 4, 4 }, + { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, + { 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4 } +}; int8_t kvz_intra_get_dir_luma_predictor( const uint32_t x, @@ -246,7 +285,7 @@ } -void kvz_intra_build_reference( +void kvz_intra_build_reference_any( const int_fast8_t log2_width, const color_t color, const vector2d_t *const luma_px, @@ -256,46 +295,6 @@ { assert(log2_width >= 2 && log2_width <= 5); - // Tables for looking up the number of intra reference pixels based on - // prediction units coordinate within an LCU. - // generated by "tools/generate_ref_pixel_tables.py". - static const uint8_t num_ref_pixels_top[16][16] = { - { 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 32, 28, 24, 20, 16, 12, 8, 4, 32, 28, 24, 20, 16, 12, 8, 4 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 64, 60, 56, 52, 48, 44, 40, 36, 32, 28, 24, 20, 16, 12, 8, 4 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 32, 28, 24, 20, 16, 12, 8, 4, 32, 28, 24, 20, 16, 12, 8, 4 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4, 16, 12, 8, 4 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 } - }; - static const uint8_t num_ref_pixels_left[16][16] = { - { 64, 4, 8, 4, 16, 4, 8, 4, 32, 4, 8, 4, 16, 4, 8, 4 }, - { 60, 4, 4, 4, 12, 4, 4, 4, 28, 4, 4, 4, 12, 4, 4, 4 }, - { 56, 4, 8, 4, 8, 4, 8, 4, 24, 4, 8, 4, 8, 4, 8, 4 }, - { 52, 4, 4, 4, 4, 4, 4, 4, 20, 4, 4, 4, 4, 4, 4, 4 }, - { 48, 4, 8, 4, 
16, 4, 8, 4, 16, 4, 8, 4, 16, 4, 8, 4 }, - { 44, 4, 4, 4, 12, 4, 4, 4, 12, 4, 4, 4, 12, 4, 4, 4 }, - { 40, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 36, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4 }, - { 32, 4, 8, 4, 16, 4, 8, 4, 32, 4, 8, 4, 16, 4, 8, 4 }, - { 28, 4, 4, 4, 12, 4, 4, 4, 28, 4, 4, 4, 12, 4, 4, 4 }, - { 24, 4, 8, 4, 8, 4, 8, 4, 24, 4, 8, 4, 8, 4, 8, 4 }, - { 20, 4, 4, 4, 4, 4, 4, 4, 20, 4, 4, 4, 4, 4, 4, 4 }, - { 16, 4, 8, 4, 16, 4, 8, 4, 16, 4, 8, 4, 16, 4, 8, 4 }, - { 12, 4, 4, 4, 12, 4, 4, 4, 12, 4, 4, 4, 12, 4, 4, 4 }, - { 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4, 8, 4 }, - { 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4 } - }; - refs->filtered_initialized = false; kvz_pixel *out_left_ref = &refs->ref.left[0]; kvz_pixel *out_top_ref = &refs->ref.top[0]; @@ -410,6 +409,137 @@ } } +void kvz_intra_build_reference_inner( + const int_fast8_t log2_width, + const color_t color, + const vector2d_t *const luma_px, + const vector2d_t *const pic_px, + const lcu_t *const lcu, + kvz_intra_references *const refs) +{ + assert(log2_width >= 2 && log2_width <= 5); + + refs->filtered_initialized = false; + kvz_pixel * __restrict out_left_ref = &refs->ref.left[0]; + kvz_pixel * __restrict out_top_ref = &refs->ref.top[0]; + + const int is_chroma = color != COLOR_Y ? 1 : 0; + const int_fast8_t width = 1 << log2_width; + + // Convert luma coordinates to chroma coordinates for chroma. + const vector2d_t lcu_px = { + luma_px->x % LCU_WIDTH, + luma_px->y % LCU_WIDTH + }; + const vector2d_t px = { + lcu_px.x >> is_chroma, + lcu_px.y >> is_chroma, + }; + + // Init pointers to LCUs reconstruction buffers, such that index 0 refers to block coordinate 0. + const kvz_pixel * __restrict left_ref = !color ? &lcu->left_ref.y[1] : (color == 1) ? &lcu->left_ref.u[1] : &lcu->left_ref.v[1]; + const kvz_pixel * __restrict top_ref = !color ? &lcu->top_ref.y[1] : (color == 1) ? &lcu->top_ref.u[1] : &lcu->top_ref.v[1]; + const kvz_pixel * __restrict rec_ref = !color ? 
lcu->rec.y : (color == 1) ? lcu->rec.u : lcu->rec.v; + + // Init top borders pointer to point to the correct place in the correct reference array. + const kvz_pixel * __restrict top_border; + if (px.y) { + top_border = &rec_ref[px.x + (px.y - 1) * (LCU_WIDTH >> is_chroma)]; + } else { + top_border = &top_ref[px.x]; + + } + + // Init left borders pointer to point to the correct place in the correct reference array. + const kvz_pixel * __restrict left_border; + int left_stride; // Distance between reference samples. + + // Generate top-left reference. + // If the block is at an LCU border, the top-left must be copied from + // the border that points to the LCUs 1D reference buffer. + if (px.x) { + left_border = &rec_ref[px.x - 1 + px.y * (LCU_WIDTH >> is_chroma)]; + left_stride = LCU_WIDTH >> is_chroma; + out_left_ref[0] = top_border[-1]; + out_top_ref[0] = top_border[-1]; + } else { + left_border = &left_ref[px.y]; + left_stride = 1; + out_left_ref[0] = left_border[-1 * left_stride]; + out_top_ref[0] = left_border[-1 * left_stride]; + } + + // Generate left reference. + + // Get the number of reference pixels based on the PU coordinate within the LCU. + int px_available_left = num_ref_pixels_left[lcu_px.y / 4][lcu_px.x / 4] >> is_chroma; + + // Limit the number of available pixels based on block size and dimensions + // of the picture. + px_available_left = MIN(px_available_left, width * 2); + px_available_left = MIN(px_available_left, (pic_px->y - luma_px->y) >> is_chroma); + + // Copy pixels from coded CUs. + int i = 0; + do { + out_left_ref[i + 1] = left_border[(i + 0) * left_stride]; + out_left_ref[i + 2] = left_border[(i + 1) * left_stride]; + out_left_ref[i + 3] = left_border[(i + 2) * left_stride]; + out_left_ref[i + 4] = left_border[(i + 3) * left_stride]; + i += 4; + } while (i < px_available_left); + + // Extend the last pixel for the rest of the reference values. 
+ kvz_pixel nearest_pixel = out_left_ref[i]; + for (; i < width * 2; i += 4) { + out_left_ref[i + 1] = nearest_pixel; + out_left_ref[i + 2] = nearest_pixel; + out_left_ref[i + 3] = nearest_pixel; + out_left_ref[i + 4] = nearest_pixel; + } + + // Generate top reference. + + // Get the number of reference pixels based on the PU coordinate within the LCU. + int px_available_top = num_ref_pixels_top[lcu_px.y / 4][lcu_px.x / 4] >> is_chroma; +
kvazaar-1.0.0.tar.gz/src/kvazaar.c -> kvazaar-1.1.0.tar.gz/src/kvazaar.c
Changed
@@ -50,7 +50,8 @@
       }
       FREE_POINTER(encoder->states);
 
-      kvz_encoder_control_free(encoder->control);
+      // Discard const from the pointer.
+      kvz_encoder_control_free((void*) encoder->control);
       encoder->control = NULL;
     }
     FREE_POINTER(encoder);
@@ -68,21 +69,17 @@
     goto kvazaar_open_failure;
   }
 
-  kvz_init_exp_golomb();
-
   encoder = calloc(1, sizeof(kvz_encoder));
   if (!encoder) {
     goto kvazaar_open_failure;
   }
 
-  // FIXME: const qualifier disgarded. I don't want to change kvazaar_open
-  // but I really need to change cfg.
-  encoder->control = kvz_encoder_control_init((kvz_config*)cfg);
+  encoder->control = kvz_encoder_control_init(cfg);
   if (!encoder->control) {
     goto kvazaar_open_failure;
   }
 
-  encoder->num_encoder_states = encoder->control->owf + 1;
+  encoder->num_encoder_states = encoder->control->cfg.owf + 1;
   encoder->cur_state_num = 0;
   encoder->out_state_num = 0;
   encoder->frames_started = 0;
@@ -213,13 +210,13 @@
 
   encoder_state_t *state = &enc->states[enc->cur_state_num];
 
-  if (!state->prepared) {
+  if (!state->frame->prepared) {
     kvz_encoder_prepare(state);
   }
 
   if (pic_in != NULL) {
     // FIXME: The frame number printed here is wrong when GOP is enabled.
-    CHECKPOINT_MARK("read source frame: %d", state->frame->num + enc->control->cfg->seek);
+    CHECKPOINT_MARK("read source frame: %d", state->frame->num + enc->control->cfg.seek);
   }
 
   kvz_picture* frame = kvz_encoder_feed_frame(&enc->input_buffer, state, pic_in);
@@ -235,13 +232,13 @@
     return 1;
   }
 
-  if (!state->frame_done) {
+  if (!state->frame->done) {
     // We started encoding a frame; move to the next encoder state.
     enc->cur_state_num = (enc->cur_state_num + 1) % (enc->num_encoder_states);
   }
 
   encoder_state_t *output_state = &enc->states[enc->out_state_num];
-  if (!output_state->frame_done &&
+  if (!output_state->frame->done &&
       (pic_in == NULL || enc->cur_state_num == enc->out_state_num)) {
 
     kvz_threadqueue_waitfor(enc->control->threadqueue, output_state->tqj_bitstream_written);
@@ -256,8 +253,8 @@
     if (src_out) *src_out = kvz_image_copy_ref(output_state->tile->frame->source);
     if (info_out) set_frame_info(info_out, output_state);
 
-    output_state->frame_done = 1;
-    output_state->prepared = 0;
+    output_state->frame->done = 1;
+    output_state->frame->prepared = 0;
     enc->frames_done += 1;
 
     enc->out_state_num = (enc->out_state_num + 1) % (enc->num_encoder_states);
@@ -275,7 +272,7 @@
                        kvz_picture **src_out,
                        kvz_frame_info *info_out)
 {
-  if (enc->control->cfg->source_scan_type == KVZ_INTERLACING_NONE) {
+  if (enc->control->cfg.source_scan_type == KVZ_INTERLACING_NONE) {
     // For progressive, simply call the normal encoding function.
     return kvazaar_encode(enc, pic_in, data_out, len_out, pic_out, src_out, info_out);
   }
kvazaar-1.0.0.tar.gz/src/kvazaar.h -> kvazaar-1.1.0.tar.gz/src/kvazaar.h
Changed
@@ -188,6 +188,16 @@
   KVZ_CSP_444 = 3,
 };
 
+/**
+ * \brief Chroma subsampling format used for encoding.
+ * \since 3.15.0
+ */
+enum kvz_slices {
+  KVZ_SLICES_NONE,
+  KVZ_SLICES_TILES = (1 << 0), /*!< \brief Put each tile in a slice. */
+  KVZ_SLICES_WPP = (1 << 1), /*!< \brief Put each row in a slice. */
+};
+
 // Map from input format to chroma format.
 #define KVZ_FORMAT2CSP(format) ((enum kvz_chroma_format)"\0\1\2\3"[format])
 
@@ -319,6 +329,14 @@
   } gop_lp_definition;
 
   int32_t implicit_rdpcm; /*!< \brief Enable implicit residual DPCM. */
+
+  struct {
+    int32_t width;
+    int32_t height;
+    uint8_t *dqps;
+  } roi; /*!< \since 3.14.0 \brief Map of delta QPs for region of interest coding. */
+
+  unsigned slices; /*!< \since 3.15.0 \brief How to map slices to frame. */
 } kvz_config;
 
 /**
@@ -347,6 +365,8 @@
   enum kvz_interlacing interlacing; //!< \since 3.2.0 \brief Field order for interlaced pictures.
   enum kvz_chroma_format chroma_format;
+
+  int32_t ref_pocs[16];
 } kvz_picture;
 
 /**
@@ -551,9 +571,6 @@
  *
  * Only one encoder may be open at a time.
  *
- * The caller must not modify the config between passing it to this function
- * and calling encoder_close.
- *
  * \param cfg encoder configuration
  * \return created encoder, or NULL if creation failed.
  */
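The new `kvz_slices` enum is a set of bit flags stored in the `unsigned slices` field of `kvz_config`, so both modes can be enabled together. (The `\brief` text above the enum reads as if it was carried over from `kvz_chroma_format`.) The sketch below shows how such flags combine and how they can be tested. The enum is copied from the hunk, while the `slices_has` helper is hypothetical and not part of the kvazaar API.

```c
#include <assert.h>
#include <stdbool.h>

/* Copied from the hunk above. */
enum kvz_slices {
  KVZ_SLICES_NONE,
  KVZ_SLICES_TILES = (1 << 0), /* Put each tile in a slice. */
  KVZ_SLICES_WPP   = (1 << 1), /* Put each row in a slice. */
};

/* Hypothetical helper: true if the given flag is set in the slices field. */
static bool slices_has(unsigned slices, enum kvz_slices flag)
{
  return (slices & (unsigned)flag) != 0;
}
```

Because `KVZ_SLICES_NONE` is 0, testing for it with a bitwise AND always yields false; compare the whole field against `KVZ_SLICES_NONE` instead.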
kvazaar-1.0.0.tar.gz/src/kvazaar_internal.h -> kvazaar-1.1.0.tar.gz/src/kvazaar_internal.h
Changed
@@ -37,7 +37,7 @@
 struct encoder_control_t;
 
 struct kvz_encoder {
-  struct encoder_control_t* control;
+  const struct encoder_control_t* control;
   struct encoder_state_t* states;
   unsigned num_encoder_states;
 
kvazaar-1.0.0.tar.gz/src/rate_control.c -> kvazaar-1.1.0.tar.gz/src/rate_control.c
Changed
@@ -27,167 +27,316 @@ static const int SMOOTHING_WINDOW = 40; +static const double MIN_LAMBDA = 0.1; +static const double MAX_LAMBDA = 10000; + +/** + * \brief Clip lambda value to a valid range. + */ +static double clip_lambda(double lambda) { + if (isnan(lambda)) return MAX_LAMBDA; + return CLIP(MIN_LAMBDA, MAX_LAMBDA, lambda); +} /** * \brief Update alpha and beta parameters. - * \param state the main encoder state * - * Sets global->rc_alpha and global->rc_beta of the encoder state. + * \param bits number of bits spent for coding the area + * \param pixels size of the area in pixels + * \param lambda_real lambda used for coding the area + * \param[in,out] alpha alpha parameter to update + * \param[in,out] beta beta parameter to update */ -static void update_rc_parameters(encoder_state_t * state) +static void update_parameters(uint32_t bits, + uint32_t pixels, + double lambda_real, + double *alpha, + double *beta) { - const encoder_control_t * const encoder = state->encoder_control; - - const double pixels_per_picture = encoder->in.width * encoder->in.height; - const double bpp = state->stats_bitstream_length * 8 / pixels_per_picture; - const double log_bpp = log(bpp); - - const double alpha_old = state->frame->rc_alpha; - const double beta_old = state->frame->rc_beta; - // lambda computed from real bpp - const double lambda_comp = CLIP(0.1, 10000, alpha_old * pow(bpp, beta_old)); - // lambda used in encoding - const double lambda_real = state->frame->cur_lambda_cost; + const double bpp = bits / (double)pixels; + const double lambda_comp = clip_lambda(*alpha * pow(bpp, *beta)); const double lambda_log_ratio = log(lambda_real) - log(lambda_comp); - const double alpha = alpha_old + 0.1 * lambda_log_ratio * alpha_old; - state->frame->rc_alpha = CLIP(0.05, 20, alpha); + *alpha += 0.10 * lambda_log_ratio * (*alpha); + *alpha = CLIP(0.05, 20, *alpha); - const double beta = beta_old + 0.05 * lambda_log_ratio * CLIP(-5, 1, log_bpp); - state->frame->rc_beta = CLIP(-3, 
-0.1, beta); + *beta += 0.05 * lambda_log_ratio * CLIP(-5.0, -1.0, log(bpp)); + *beta = CLIP(-3, -0.1, *beta); } /** * \brief Allocate bits for the current GOP. - * \param state the main encoder state - * - * If GOPs are not used, allocates bits for a single picture. - * - * Sets the cur_gop_target_bits of the encoder state. + * \param state the main encoder state + * \return target number of bits */ -static void gop_allocate_bits(encoder_state_t * const state) +static double gop_allocate_bits(encoder_state_t * const state) { const encoder_control_t * const encoder = state->encoder_control; // At this point, total_bits_coded of the current state contains the // number of bits written encoder->owf frames before the current frame. uint64_t bits_coded = state->frame->total_bits_coded; - int pictures_coded = MAX(0, state->frame->num - encoder->owf); + int pictures_coded = MAX(0, state->frame->num - encoder->cfg.owf); - int gop_offset = (state->frame->gop_offset - encoder->owf) % MAX(1, encoder->cfg->gop_len); + int gop_offset = (state->frame->gop_offset - encoder->cfg.owf) % MAX(1, encoder->cfg.gop_len); // Only take fully coded GOPs into account. - if (encoder->cfg->gop_len > 0 && gop_offset != encoder->cfg->gop_len - 1) { + if (encoder->cfg.gop_len > 0 && gop_offset != encoder->cfg.gop_len - 1) { // Subtract number of bits in the partially coded GOP. bits_coded -= state->frame->cur_gop_bits_coded; // Subtract number of pictures in the partially coded GOP. pictures_coded -= gop_offset + 1; } + // Equation 12 from https://doi.org/10.1109/TIP.2014.2336550 double gop_target_bits = (encoder->target_avg_bppic * (pictures_coded + SMOOTHING_WINDOW) - bits_coded) - * MAX(1, encoder->cfg->gop_len) / SMOOTHING_WINDOW; - state->frame->cur_gop_target_bits = MAX(200, gop_target_bits); + * MAX(1, encoder->cfg.gop_len) / SMOOTHING_WINDOW; + // Allocate at least 200 bits for each GOP like HM does. + return MAX(200, gop_target_bits); } /** - * Allocate bits for the current picture. 
- * \param state the main encoder state - * \return target number of bits + * Estimate number of bits used for headers of the current picture. + * \param state the main encoder state + * \return number of header bits */ -static double pic_allocate_bits(const encoder_state_t * const state) +static uint64_t pic_header_bits(encoder_state_t * const state) { - const encoder_control_t * const encoder = state->encoder_control; + const kvz_config* cfg = &state->encoder_control->cfg; - if (encoder->cfg->gop_len <= 0) { - return state->frame->cur_gop_target_bits; + // nal type and slice header + uint64_t bits = 48 + 24; + + // entry points + bits += 12 * state->encoder_control->in.height_in_lcu; + + switch (cfg->hash) { + case KVZ_HASH_CHECKSUM: + bits += 168; + break; + + case KVZ_HASH_MD5: + bits += 456; + break; + + case KVZ_HASH_NONE: + break; } - const double pic_weight = encoder->gop_layer_weights[ - encoder->cfg->gop[state->frame->gop_offset].layer - 1]; - double pic_target_bits = state->frame->cur_gop_target_bits * pic_weight; - return MAX(100, pic_target_bits); + if (encoder_state_must_write_vps(state)) { + bits += 613; + } + + if (state->frame->num == 0 && cfg->add_encoder_info) { + bits += 1392; + } + + return bits; } /** - * \brief Select a lambda value for encoding the next picture - * \param state the main encoder state - * \return lambda for the next picture - * - * Rate control must be enabled (i.e. cfg->target_bitrate > 0) when this - * function is called. + * Allocate bits for the current picture. + * \param state the main encoder state + * \return target number of bits, excluding headers */ -double kvz_select_picture_lambda(encoder_state_t * const state) +static double pic_allocate_bits(encoder_state_t * const state) { const encoder_control_t * const encoder = state->encoder_control; - assert(encoder->cfg->target_bitrate > 0); - - if (state->frame->num > encoder->owf) { - // At least one frame has been written. 
- update_rc_parameters(state); - } - - if (encoder->cfg->gop_len == 0 || + if (encoder->cfg.gop_len == 0 || state->frame->gop_offset == 0 || state->frame->num == 0) { - // A new GOP begins at this frame. - gop_allocate_bits(state); + // A new GOP starts at this frame. + state->frame->cur_gop_target_bits = gop_allocate_bits(state); + state->frame->cur_gop_bits_coded = 0; } else { state->frame->cur_gop_target_bits = state->previous_encoder_state->frame->cur_gop_target_bits; } - // TODO: take the picture headers into account - const double target_bits_current_picture = pic_allocate_bits(state); - const double target_bits_per_pixel = - target_bits_current_picture / encoder->in.pixels_per_pic; - const double lambda = - state->frame->rc_alpha * pow(target_bits_per_pixel, state->frame->rc_beta); - return CLIP(0.1, 10000, lambda); + if (encoder->cfg.gop_len <= 0) { + return state->frame->cur_gop_target_bits; + }
kvazaar-1.0.0.tar.gz/src/rate_control.h -> kvazaar-1.1.0.tar.gz/src/rate_control.h
Changed
@@ -30,11 +30,9 @@
 #include "encoderstate.h"

+void kvz_set_picture_lambda_and_qp(encoder_state_t * const state);

-double kvz_select_picture_lambda(encoder_state_t * const state);
-
-int8_t kvz_lambda_to_QP(const double lambda);
-
-double kvz_select_picture_lambda_from_qp(encoder_state_t const * const state);
+void kvz_set_lcu_lambda_and_qp(encoder_state_t * const state,
+                               vector2d_t pos);

 #endif // RATE_CONTROL_H_
kvazaar-1.0.0.tar.gz/src/rdo.c -> kvazaar-1.1.0.tar.gz/src/rdo.c
Changed
@@ -127,6 +127,19 @@ }; +// This struct is for passing data to kvz_rdoq_sign_hiding +struct sh_rates_t { + // Bit cost of increasing rate by one. + int32_t inc[32 * 32]; + // Bit cost of decreasing rate by one. + int32_t dec[32 * 32]; + // Bit cost of going from zero to one. + int32_t sig_coeff_inc[32 * 32]; + // Coeff minus quantized coeff. + int32_t quant_delta[32 * 32]; +}; + + /** Calculate actual (or really close to actual) bitcost for coding coefficients * \param coeff coefficient array * \param width coeff block width @@ -188,7 +201,7 @@ int8_t type) { cabac_data_t * const cabac = &state->cabac; - int32_t rate = 32768; + int32_t rate = 1 << CTX_FRAC_BITS; uint32_t base_level = (c1_idx < C1FLAG_NUMBER)? (2 + (c2_idx < C2FLAG_NUMBER)) : 1; cabac_ctx_t *base_one_ctx = (type == 0) ? &(cabac->ctx.cu_one_model_luma[0]) : &(cabac->ctx.cu_one_model_chroma[0]); cabac_ctx_t *base_abs_ctx = (type == 0) ? &(cabac->ctx.cu_abs_model_luma[0]) : &(cabac->ctx.cu_abs_model_chroma[0]); @@ -198,14 +211,14 @@ int32_t length; if (symbol < (COEF_REMAIN_BIN_REDUCTION << abs_go_rice)) { length = symbol>>abs_go_rice; - rate += (length+1+abs_go_rice)<< 15; + rate += (length+1+abs_go_rice) << CTX_FRAC_BITS; } else { length = abs_go_rice; symbol = symbol - ( COEF_REMAIN_BIN_REDUCTION << abs_go_rice); while (symbol >= (1<<length)) { symbol -= (1<<(length++)); } - rate += (COEF_REMAIN_BIN_REDUCTION+length+1-abs_go_rice+length)<< 15; + rate += (COEF_REMAIN_BIN_REDUCTION+length+1-abs_go_rice+length) << CTX_FRAC_BITS; } if (c1_idx < C1FLAG_NUMBER) { rate += CTX_ENTROPY_BITS(&base_one_ctx[ctx_num_one],1); @@ -257,7 +270,7 @@ cabac_ctx_t* base_sig_model = type?(cabac->ctx.cu_sig_model_chroma):(cabac->ctx.cu_sig_model_luma); if( !last && max_abs_level < 3 ) { - *coded_cost_sig = state->frame->cur_lambda_cost * CTX_ENTROPY_BITS(&base_sig_model[ctx_num_sig], 0); + *coded_cost_sig = state->lambda * CTX_ENTROPY_BITS(&base_sig_model[ctx_num_sig], 0); *coded_cost = *coded_cost0 + *coded_cost_sig; if 
(max_abs_level == 0) return best_abs_level; } else { @@ -265,13 +278,13 @@ } if( !last ) { - cur_cost_sig = state->frame->cur_lambda_cost * CTX_ENTROPY_BITS(&base_sig_model[ctx_num_sig], 1); + cur_cost_sig = state->lambda * CTX_ENTROPY_BITS(&base_sig_model[ctx_num_sig], 1); } min_abs_level = ( max_abs_level > 1 ? max_abs_level - 1 : 1 ); for (abs_level = max_abs_level; abs_level >= min_abs_level ; abs_level-- ) { double err = (double)(level_double - ( abs_level << q_bits ) ); - double cur_cost = err * err * temp + state->frame->cur_lambda_cost * + double cur_cost = err * err * temp + state->lambda * kvz_get_ic_rate( state, abs_level, ctx_num_one, ctx_num_abs, abs_go_rice, c1_idx, c2_idx, type); cur_cost += cur_cost_sig; @@ -303,12 +316,12 @@ uint32_t ctx_y = g_group_idx[pos_y]; double uiCost = last_x_bits[ ctx_x ] + last_y_bits[ ctx_y ]; if( ctx_x > 3 ) { - uiCost += 32768.0 * ((ctx_x-2)>>1); + uiCost += CTX_FRAC_ONE_BIT * ((ctx_x - 2) >> 1); } if( ctx_y > 3 ) { - uiCost += 32768.0 * ((ctx_y-2)>>1); + uiCost += CTX_FRAC_ONE_BIT * ((ctx_y - 2) >> 1); } - return state->frame->cur_lambda_cost*uiCost; + return state->lambda * uiCost; } static void calc_last_bits(encoder_state_t * const state, int32_t width, int32_t height, int8_t type, @@ -342,109 +355,147 @@ last_y_bits[ctx] = bits_y; } - -void kvz_rdoq_sign_hiding(const encoder_state_t *const state, - const int32_t qp_scaled, - const uint32_t *const scan, - const int32_t delta_u[32 * 32], - const int32_t rate_inc_up[32 * 32], - const int32_t rate_inc_down[32 * 32], - const int32_t sig_rate_delta[32 * 32], - const int32_t width, - const coeff_t *const coef, - coeff_t *const dest_coeff) +/** + * \brief Select which coefficient to change for sign hiding, and change it. + * + * When sign hiding is enabled, the last sign bit of the last coefficient is + * calculated from the parity of the other coefficients. If the parity is not + * correct, one coefficient has to be changed by one. 
This function uses + * tables generated during RDOQ to select the best coefficient to change. + */ +void kvz_rdoq_sign_hiding( + const encoder_state_t *const state, + const int32_t qp_scaled, + const uint32_t *const scan2raster, + const struct sh_rates_t *const sh_rates, + const int32_t last_pos, + const coeff_t *const coeffs, + coeff_t *const quant_coeffs) { - const encoder_control_t * const encoder = state->encoder_control; - - int64_t rd_factor = (int64_t)( - kvz_g_inv_quant_scales[qp_scaled % 6] * kvz_g_inv_quant_scales[qp_scaled % 6] * (1 << (2 * (qp_scaled / 6))) - / state->frame->cur_lambda_cost / 16 / (1 << (2 * (encoder->bitdepth - 8))) - + 0.5); - int32_t lastCG = -1; - int32_t absSum = 0; - - for (int32_t subset = (width - 1) >> LOG2_SCAN_SET_SIZE; subset >= 0; subset--) { - int32_t subPos = subset << LOG2_SCAN_SET_SIZE; - int32_t firstNZPosInCG = SCAN_SET_SIZE, lastNZPosInCG = -1; - absSum = 0; - - for (int32_t n = SCAN_SET_SIZE - 1; n >= 0; --n) { - if (dest_coeff[scan[n + subPos]]) { - lastNZPosInCG = n; + const encoder_control_t * const ctrl = state->encoder_control; + + int inv_quant = kvz_g_inv_quant_scales[qp_scaled % 6]; + // This somehow scales quant_delta into fractional bits. Instead of the bits + // being multiplied by lambda, the residual is divided by it, or something + // like that. + const int64_t rd_factor = (inv_quant * inv_quant * (1 << (2 * (qp_scaled / 6))) + / state->lambda / 16 / (1 << (2 * (ctrl->bitdepth - 8))) + 0.5); + const int last_cg = (last_pos - 1) >> LOG2_SCAN_SET_SIZE; + + for (int32_t cg_scan = last_cg; cg_scan >= 0; cg_scan--) { + const int32_t cg_coeff_scan = cg_scan << LOG2_SCAN_SET_SIZE; + + // Find positions of first and last non-zero coefficients in the CG. 
+ int32_t last_nz_scan = -1; + for (int32_t coeff_i = SCAN_SET_SIZE - 1; coeff_i >= 0; --coeff_i) { + if (quant_coeffs[scan2raster[coeff_i + cg_coeff_scan]]) { + last_nz_scan = coeff_i; break; } } - - for (int32_t n = 0; n <= lastNZPosInCG; n++) { - if (dest_coeff[scan[n + subPos]]) { - firstNZPosInCG = n; + int32_t first_nz_scan = SCAN_SET_SIZE; + for (int32_t coeff_i = 0; coeff_i <= last_nz_scan; coeff_i++) { + if (quant_coeffs[scan2raster[coeff_i + cg_coeff_scan]]) { + first_nz_scan = coeff_i; break; } } - for (int32_t n = firstNZPosInCG; n <= lastNZPosInCG; n++) { - absSum += dest_coeff[scan[n + subPos]]; + if (last_nz_scan - first_nz_scan < SBH_THRESHOLD) { + continue; } - if (lastNZPosInCG >= 0 && lastCG == -1) lastCG = 1; + const int32_t signbit = quant_coeffs[scan2raster[cg_coeff_scan + first_nz_scan]] <= 0; + unsigned abs_coeff_sum = 0; + for (int32_t coeff_scan = first_nz_scan; coeff_scan <= last_nz_scan; coeff_scan++) { + abs_coeff_sum += quant_coeffs[scan2raster[coeff_scan + cg_coeff_scan]]; + } + if (signbit == (abs_coeff_sum & 0x1)) { + // Sign already matches with the parity, no need to modify coefficients. + continue; + } - if (lastNZPosInCG - firstNZPosInCG >= SBH_THRESHOLD) { - int32_t signbit = (dest_coeff[scan[subPos + firstNZPosInCG]]>0 ? 0 : 1); - if (signbit != (absSum & 0x1)) { // hide but need tune - // calculate the cost - int64_t minCostInc = MAX_INT64, curCost = MAX_INT64; - int32_t minPos = -1, finalChange = 0, curChange = 0; + // Otherwise, search for the best coeff to change by one and change it. - for (int32_t n = (lastCG == 1 ? lastNZPosInCG : SCAN_SET_SIZE - 1); n >= 0; --n) { - uint32_t blkpos = scan[n + subPos]; - if (dest_coeff[blkpos] != 0) { - int64_t costUp = rd_factor * (-delta_u[blkpos]) + rate_inc_up[blkpos]; - int64_t costDown = rd_factor * (delta_u[blkpos]) + rate_inc_down[blkpos] - - (abs(dest_coeff[blkpos]) == 1 ? ((1 << 15) + sig_rate_delta[blkpos]) : 0);
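The rewritten `kvz_rdoq_sign_hiding` first decides, per coefficient group, whether any coefficient has to change at all: the sign of the coefficient at `first_nz_scan` is compared against the parity of the coefficient sum. A minimal sketch of only that check, assuming a flat array already in scan order; the function name and the `int16_t` element type are hypothetical, and the `SBH_THRESHOLD` distance test from the diff is left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns true when the parity of the summed magnitudes does not match the
 * sign of the first non-zero coefficient, meaning one coefficient in the
 * group would have to be changed by one. first_nz and last_nz are the
 * positions of the first and last non-zero coefficients. */
static bool sign_hiding_needs_fix(const int16_t *coeffs, int first_nz, int last_nz)
{
  const unsigned signbit = coeffs[first_nz] < 0;
  unsigned abs_sum = 0;
  for (int i = first_nz; i <= last_nz; i++) {
    abs_sum += (unsigned)abs(coeffs[i]);
  }
  return signbit != (abs_sum & 1u);
}
```

Only the parity of the sum matters, which is why summing the signed values, as the diff does, gives the same result as summing magnitudes.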
kvazaar-1.0.0.tar.gz/src/rdo.h -> kvazaar-1.1.0.tar.gz/src/rdo.h
Changed
@@ -54,7 +54,11 @@
 uint32_t kvz_get_mvd_coding_cost_cabac(encoder_state_t * const state, vector2d_t *mvd, const cabac_data_t* cabac);

-// Fixed points fractional bits, 16b.16b
+// Number of fixed point fractional bits used in the fractional bit table.
+#define CTX_FRAC_BITS 15
+#define CTX_FRAC_ONE_BIT (1 << CTX_FRAC_BITS)
+#define CTX_FRAC_HALF_BIT (1 << (CTX_FRAC_BITS - 1))
+
 extern const uint32_t kvz_entropy_bits[128];
 #define CTX_ENTROPY_BITS(ctx, val) kvz_entropy_bits[(ctx)->uc_state ^ (val)]
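The three macros introduced here replace the literal `15`, `32768` and `1 << 14` constants scattered through rdo.c and search_inter.c. A short sketch of how they combine, using the macro definitions from the diff; the two helper functions are hypothetical names added only for illustration.

```c
#include <stdint.h>

#define CTX_FRAC_BITS 15
#define CTX_FRAC_ONE_BIT (1 << CTX_FRAC_BITS)
#define CTX_FRAC_HALF_BIT (1 << (CTX_FRAC_BITS - 1))

/* Convert a whole number of bits to the fixed-point representation. */
static uint32_t bits_to_frac(uint32_t bits)
{
  return bits << CTX_FRAC_BITS;
}

/* Round a fixed-point bit cost to the nearest whole bit (halves round up),
 * the same expression the diff uses when shifting back to integer bits. */
static uint32_t frac_to_bits(uint32_t frac_bits)
{
  return (frac_bits + CTX_FRAC_HALF_BIT) >> CTX_FRAC_BITS;
}
```

Adding `CTX_FRAC_HALF_BIT` before the shift turns truncation into rounding, so a cost of 3.5 bits becomes 4 and a cost just under 0.5 bits becomes 0.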
kvazaar-1.0.0.tar.gz/src/sao.c -> kvazaar-1.1.0.tar.gz/src/sao.c
Changed
@@ -501,7 +501,7 @@ { float mode_bits = sao_mode_bits_edge(state, edge_class, edge_offset, sao_top, sao_left, buf_cnt); - sum_ddistortion += (int)((double)mode_bits*state->frame->cur_lambda_cost+0.5); + sum_ddistortion += (int)((double)mode_bits*state->lambda +0.5); } // SAO is not applied for category 0. edge_offset[SAO_EO_CAT0] = 0; @@ -545,7 +545,7 @@ } temp_rate = sao_mode_bits_band(state, sao_out->band_position, temp_offsets, sao_top, sao_left, buf_cnt); - ddistortion += (int)((double)temp_rate*state->frame->cur_lambda_cost + 0.5); + ddistortion += (int)((double)temp_rate*state->lambda + 0.5); // Select band sao over edge sao when distortion is lower if (ddistortion < sao_out->ddistortion) { @@ -589,7 +589,7 @@ { float mode_bits = sao_mode_bits_edge(state, edge_sao.eo_class, edge_sao.offsets, sao_top, sao_left, buf_cnt); - int ddistortion = (int)(mode_bits * state->frame->cur_lambda_cost + 0.5); + int ddistortion = (int)(mode_bits * state->lambda + 0.5); unsigned buf_i; for (buf_i = 0; buf_i < buf_cnt; ++buf_i) { @@ -603,7 +603,7 @@ { float mode_bits = sao_mode_bits_band(state, band_sao.band_position, band_sao.offsets, sao_top, sao_left, buf_cnt); - int ddistortion = (int)(mode_bits * state->frame->cur_lambda_cost + 0.5); + int ddistortion = (int)(mode_bits * state->lambda + 0.5); unsigned buf_i; for (buf_i = 0; buf_i < buf_cnt; ++buf_i) { @@ -626,7 +626,7 @@ // Choose between SAO and doing nothing, taking into account the // rate-distortion cost of coding do nothing. 
{ - int cost_of_nothing = (int)(sao_mode_bits_none(state, sao_top, sao_left) * state->frame->cur_lambda_cost + 0.5); + int cost_of_nothing = (int)(sao_mode_bits_none(state, sao_top, sao_left) * state->lambda + 0.5); if (sao_out->ddistortion >= cost_of_nothing) { sao_out->type = SAO_TYPE_NONE; merge_cost[0] = cost_of_nothing; @@ -643,7 +643,7 @@ if (merge_cand) { unsigned buf_i; float mode_bits = sao_mode_bits_merge(state, i + 1); - int ddistortion = (int)(mode_bits * state->frame->cur_lambda_cost + 0.5); + int ddistortion = (int)(mode_bits * state->lambda + 0.5); switch (merge_cand->type) { case SAO_TYPE_EDGE: @@ -741,7 +741,7 @@ void kvz_sao_search_lcu(const encoder_state_t* const state, int lcu_x, int lcu_y) { - assert(!state->encoder_control->cfg->lossless); + assert(!state->encoder_control->cfg.lossless); videoframe_t* const frame = state->tile->frame; const int stride = frame->width_in_lcu;
kvazaar-1.0.0.tar.gz/src/scalinglist.c -> kvazaar-1.1.0.tar.gz/src/scalinglist.c
Changed
@@ -23,6 +23,7 @@
 #include <string.h>

 #include "scalinglist.h"
+#include "rdo.h"
 #include "tables.h"

@@ -345,7 +346,7 @@
   double *err_scale = (double *) scaling_list->error_scale[size][list][qp];

   // Compensate for scaling of bitcount in Lagrange cost function
-  double scale = (double)(1<<15);
+  double scale = CTX_FRAC_ONE_BIT;
   // Compensate for scaling through forward transform
   scale = scale*pow(2.0,-2.0*transform_shift);
   for(i=0;i<max_num_coeff;i++) {
kvazaar-1.0.0.tar.gz/src/search.c -> kvazaar-1.1.0.tar.gz/src/search.c
Changed
@@ -35,6 +35,7 @@ #include "threadqueue.h" #include "transform.h" #include "videoframe.h" +#include "strategies/strategies-picture.h" #define IN_FRAME(x, y, width, height, block_width, block_height) \ @@ -320,7 +321,7 @@ sum += kvz_cu_rd_cost_luma(state, x_px, y_px + offset, depth + 1, pred_cu, lcu); sum += kvz_cu_rd_cost_luma(state, x_px + offset, y_px + offset, depth + 1, pred_cu, lcu); - return sum + tr_tree_bits * state->frame->cur_lambda_cost; + return sum + tr_tree_bits * state->lambda; } // Add transform_tree cbf_luma bit cost. @@ -335,7 +336,7 @@ // SSD between reconstruction and original int ssd = 0; - if (!state->encoder_control->cfg->lossless) { + if (!state->encoder_control->cfg.lossless) { int index = y_px * LCU_WIDTH + x_px; ssd = kvz_pixels_calc_ssd(&lcu->ref.y[index], &lcu->rec.y[index], LCU_WIDTH, LCU_WIDTH, @@ -352,7 +353,7 @@ } double bits = tr_tree_bits + coeff_bits; - return (double)ssd * LUMA_MULT + bits * state->frame->cur_lambda_cost; + return (double)ssd * LUMA_MULT + bits * state->lambda; } @@ -397,12 +398,12 @@ sum += kvz_cu_rd_cost_chroma(state, x_px, y_px + offset, depth + 1, pred_cu, lcu); sum += kvz_cu_rd_cost_chroma(state, x_px + offset, y_px + offset, depth + 1, pred_cu, lcu); - return sum + tr_tree_bits * state->frame->cur_lambda_cost; + return sum + tr_tree_bits * state->lambda; } // Chroma SSD int ssd = 0; - if (!state->encoder_control->cfg->lossless) { + if (!state->encoder_control->cfg.lossless) { int index = lcu_px.y * LCU_WIDTH_C + lcu_px.x; int ssd_u = kvz_pixels_calc_ssd(&lcu->ref.u[index], &lcu->rec.u[index], LCU_WIDTH_C, LCU_WIDTH_C, @@ -427,7 +428,7 @@ } double bits = tr_tree_bits + coeff_bits; - return (double)ssd * CHROMA_MULT + bits * state->frame->cur_lambda_cost; + return (double)ssd * CHROMA_MULT + bits * state->lambda; } @@ -515,7 +516,7 @@ bool can_use_inter = state->frame->slicetype != KVZ_SLICE_I - && WITHIN(depth, ctrl->pu_depth_inter.min, ctrl->pu_depth_inter.max); + && WITHIN(depth, 
ctrl->cfg.pu_depth_inter.min, ctrl->cfg.pu_depth_inter.max); if (can_use_inter) { double mode_cost; @@ -540,8 +541,8 @@ SIZE_nLx2N, SIZE_nRx2N, }; - const int first_mode = ctrl->cfg->smp_enable ? 0 : 2; - const int last_mode = (ctrl->cfg->amp_enable && cu_width >= 16) ? 5 : 1; + const int first_mode = ctrl->cfg.smp_enable ? 0 : 2; + const int last_mode = (ctrl->cfg.amp_enable && cu_width >= 16) ? 5 : 1; for (int i = first_mode; i <= last_mode; ++i) { kvz_search_cu_smp(state, x, y, @@ -562,11 +563,11 @@ // Try to skip intra search in rd==0 mode. // This can be quite severe on bdrate. It might be better to do this // decision after reconstructing the inter frame. - bool skip_intra = state->encoder_control->rdo == 0 + bool skip_intra = state->encoder_control->cfg.rdo == 0 && cur_cu->type != CU_NOTSET && cost / (cu_width * cu_width) < INTRA_TRESHOLD; if (!skip_intra - && WITHIN(depth, ctrl->pu_depth_intra.min, ctrl->pu_depth_intra.max)) + && WITHIN(depth, ctrl->cfg.pu_depth_intra.min, ctrl->cfg.pu_depth_intra.max)) { int8_t intra_mode; double intra_cost; @@ -598,7 +599,7 @@ // rd2. Possibly because the luma mode search already takes chroma // into account, so there is less of a chanse of luma mode being // really bad for chroma. - if (state->encoder_control->rdo == 3) { + if (state->encoder_control->cfg.rdo == 3) { intra_mode_chroma = kvz_search_cu_intra_chroma(state, x, y, depth, &work_tree[depth]); lcu_set_intra_mode(&work_tree[depth], x, y, depth, intra_mode, intra_mode_chroma, @@ -681,11 +682,13 @@ mode_bits = inter_bitcost; } - cost += mode_bits * state->frame->cur_lambda_cost; + cost += mode_bits * state->lambda; } // Recursively split all the way to max search depth. 
- if (depth < ctrl->pu_depth_intra.max || (depth < ctrl->pu_depth_inter.max && state->frame->slicetype != KVZ_SLICE_I)) { + if (depth < ctrl->cfg.pu_depth_intra.max || + (depth < ctrl->cfg.pu_depth_inter.max && state->frame->slicetype != KVZ_SLICE_I)) + { int half_cu = cu_width / 2; double split_cost = 0.0; int cbf = cbf_is_set_any(cur_cu->cbf, depth); @@ -694,25 +697,27 @@ // Add cost of cu_split_flag. uint8_t split_model = get_ctx_cu_split_model(lcu, x, y, depth); const cabac_ctx_t *ctx = &(state->cabac.ctx.split_flag_model[split_model]); - cost += CTX_ENTROPY_FBITS(ctx, 0) * state->frame->cur_lambda_cost; - split_cost += CTX_ENTROPY_FBITS(ctx, 1) * state->frame->cur_lambda_cost; + cost += CTX_ENTROPY_FBITS(ctx, 0) * state->lambda; + split_cost += CTX_ENTROPY_FBITS(ctx, 1) * state->lambda; } if (cur_cu->type == CU_INTRA && depth == MAX_DEPTH) { // Add cost of intra part_size. const cabac_ctx_t *ctx = &(state->cabac.ctx.part_size_model[0]); - cost += CTX_ENTROPY_FBITS(ctx, 1) * state->frame->cur_lambda_cost; // 2Nx2N - split_cost += CTX_ENTROPY_FBITS(ctx, 0) * state->frame->cur_lambda_cost; // NxN + cost += CTX_ENTROPY_FBITS(ctx, 1) * state->lambda; // 2Nx2N + split_cost += CTX_ENTROPY_FBITS(ctx, 0) * state->lambda; // NxN } // If skip mode was selected for the block, skip further search. // Skip mode means there's no coefficients in the block, so splitting // might not give any better results but takes more time to do. - if (cur_cu->type == CU_NOTSET || cbf || state->encoder_control->cfg->cu_split_termination == KVZ_CU_SPLIT_TERMINATION_OFF) { - split_cost += search_cu(state, x, y, depth + 1, work_tree); - split_cost += search_cu(state, x + half_cu, y, depth + 1, work_tree); - split_cost += search_cu(state, x, y + half_cu, depth + 1, work_tree); - split_cost += search_cu(state, x + half_cu, y + half_cu, depth + 1, work_tree); + // It is ok to interrupt the search as soon as it is known that + // the split costs at least as much as not splitting. 
+ if (cur_cu->type == CU_NOTSET || cbf || state->encoder_control->cfg.cu_split_termination == KVZ_CU_SPLIT_TERMINATION_OFF) { + if (split_cost < cost) split_cost += search_cu(state, x, y, depth + 1, work_tree); + if (split_cost < cost) split_cost += search_cu(state, x + half_cu, y, depth + 1, work_tree); + if (split_cost < cost) split_cost += search_cu(state, x, y + half_cu, depth + 1, work_tree); + if (split_cost < cost) split_cost += search_cu(state, x + half_cu, y + half_cu, depth + 1, work_tree); } else { split_cost = INT_MAX; } @@ -732,7 +737,7 @@ cur_cu->intra = cu_d1->intra; cur_cu->type = CU_INTRA; - cur_cu->part_size = depth > MAX_DEPTH ? SIZE_NxN : SIZE_2Nx2N; + cur_cu->part_size = SIZE_2Nx2N; kvz_lcu_set_trdepth(&work_tree[depth], x, y, depth, cur_cu->tr_depth); lcu_set_intra_mode(&work_tree[depth], x, y, depth, @@ -749,11 +754,11 @@ // Add the cost of coding no-split. uint8_t split_model = get_ctx_cu_split_model(lcu, x, y, depth); const cabac_ctx_t *ctx = &(state->cabac.ctx.split_flag_model[split_model]); - cost += CTX_ENTROPY_FBITS(ctx, 0) * state->frame->cur_lambda_cost; + cost += CTX_ENTROPY_FBITS(ctx, 0) * state->lambda; // Add the cost of coding intra mode only once. double mode_bits = calc_mode_bits(state, &work_tree[depth], cur_cu, x, y); - cost += mode_bits * state->frame->cur_lambda_cost; + cost += mode_bits * state->lambda; } } @@ -948,7 +953,10 @@ } // Start search from depth 0. - search_cu(state, x, y, 0, work_tree); + double cost = search_cu(state, x, y, 0, work_tree); + + // Save squared cost for rate control. + kvz_get_lcu_stats(state, x / LCU_WIDTH, y / LCU_WIDTH)->weight = cost * cost; // The best decisions through out the LCU got propagated back to depth 0, // so copy those back to the frame.
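The optimized CU-split decision in search.c stops recursing into sub-blocks once the accumulated split cost has reached the cost of not splitting. The pattern can be sketched in isolation as below, assuming sub-block costs are never negative. The callback type and both function names are hypothetical stand-ins for the recursive `search_cu` calls.

```c
#include <stddef.h>

/* Hypothetical callback standing in for the recursive search of sub-block
 * `index` (0..3); must return a non-negative cost. */
typedef double (*sub_cost_fn)(int index, void *ctx);

/* Example callback: every sub-block costs the value pointed to by ctx. */
static double constant_cost(int index, void *ctx)
{
  (void)index;
  return *(const double *)ctx;
}

/* Accumulate the cost of the four sub-blocks, stopping as soon as the split
 * is known to cost at least as much as not splitting. `evaluated` receives
 * the number of sub-blocks that were actually searched. */
static double split_cost_with_early_out(double cost_nosplit, double split_cost,
                                        sub_cost_fn sub_cost, void *ctx,
                                        int *evaluated)
{
  int n = 0;
  for (int i = 0; i < 4; i++) {
    if (split_cost >= cost_nosplit) break;
    split_cost += sub_cost(i, ctx);
    n++;
  }
  if (evaluated != NULL) *evaluated = n;
  return split_cost;
}
```

Because each skipped sub-block could only have added cost, the returned value is a lower bound that already loses to the no-split cost, so the final comparison is unaffected.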
kvazaar-1.0.0.tar.gz/src/search_inter.c -> kvazaar-1.1.0.tar.gz/src/search_inter.c
Changed
@@ -40,12 +40,12 @@ */ static INLINE bool fracmv_within_tile(const encoder_state_t *state, const vector2d_t* orig, int x, int y, int width, int height, int wpp_limit) { - if (state->encoder_control->cfg->mv_constraint == KVZ_MV_CONSTRAIN_NONE) { + if (state->encoder_control->cfg.mv_constraint == KVZ_MV_CONSTRAIN_NONE) { return (wpp_limit == -1 || y + (height << 2) <= (wpp_limit << 2)); }; int margin = 0; - if (state->encoder_control->cfg->mv_constraint == KVZ_MV_CONSTRAIN_FRAME_AND_TILE_MARGIN) { + if (state->encoder_control->cfg.mv_constraint == KVZ_MV_CONSTRAIN_FRAME_AND_TILE_MARGIN) { // Enforce a distance of 8 from any tile boundary. margin = 4 * 4; } @@ -68,14 +68,14 @@ static INLINE int get_wpp_limit(const encoder_state_t *state, const vector2d_t* orig) { const encoder_control_t *ctrl = state->encoder_control; - if (ctrl->owf && ctrl->wpp) { + if (ctrl->cfg.owf && ctrl->cfg.wpp) { // Limit motion vectors to the LCU-row below this row. // To avoid fractional pixel interpolation depending on things outside // this range, add a margin of 4 pixels. // - fme needs 4 pixels // - odd chroma interpolation needs 4 pixels int wpp_limit = 2 * LCU_WIDTH - 4 - orig->y % LCU_WIDTH; - if (ctrl->deblock_enable && !ctrl->sao_enable) { + if (ctrl->cfg.deblock_enable && !ctrl->cfg.sao_enable) { // As a special case, when deblocking is enabled but SAO is not, we have // to avoid the possibility of interpolation filters reaching the // non-deblocked pixels. 
The deblocking for the horizontal edge on the @@ -191,22 +191,22 @@ if (abs_mvd.x > 0) { bitcost += CTX_ENTROPY_BITS(&cabac->ctx.cu_mvd_model[1], abs_mvd.x > 1); if (abs_mvd.x > 1) { - bitcost += get_ep_ex_golomb_bitcost(abs_mvd.x - 2) << 15; + bitcost += get_ep_ex_golomb_bitcost(abs_mvd.x - 2) << CTX_FRAC_BITS; } - bitcost += 1 << 15; // sign + bitcost += CTX_FRAC_ONE_BIT; // sign } bitcost += CTX_ENTROPY_BITS(&cabac->ctx.cu_mvd_model[0], abs_mvd.y > 0); if (abs_mvd.y > 0) { bitcost += CTX_ENTROPY_BITS(&cabac->ctx.cu_mvd_model[1], abs_mvd.y > 1); if (abs_mvd.y > 1) { - bitcost += get_ep_ex_golomb_bitcost(abs_mvd.y - 2) << 15; + bitcost += get_ep_ex_golomb_bitcost(abs_mvd.y - 2) << CTX_FRAC_BITS; } - bitcost += 1 << 15; // sign + bitcost += CTX_FRAC_ONE_BIT; // sign } // Round and shift back to integer bits. - return (bitcost + (1 << 14)) >> 15; + return (bitcost + CTX_FRAC_HALF_BIT) >> CTX_FRAC_BITS; } @@ -253,7 +253,7 @@ temp_bitcost += cur_mv_cand ? cand2_cost : cand1_cost; } *bitcost = temp_bitcost; - return temp_bitcost*(int32_t)(state->frame->cur_lambda_cost_sqrt+0.5); + return temp_bitcost*(int32_t)(state->lambda_sqrt + 0.5); } @@ -268,7 +268,7 @@ }; double multiplier = 1; // If early termination is set to fast set multiplier to 0.9 - if (state->encoder_control->cfg->me_early_termination == KVZ_ME_EARLY_TERMINATION_SENSITIVE){ + if (state->encoder_control->cfg.me_early_termination == KVZ_ME_EARLY_TERMINATION_SENSITIVE){ multiplier = 0.95; } const vector2d_t *offset; @@ -324,7 +324,7 @@ kvz_mvd_cost_func *calc_mvd = calc_mvd_cost; - if (state->encoder_control->cfg->mv_rdo) { + if (state->encoder_control->cfg.mv_rdo) { calc_mvd = kvz_calc_mvd_cost_cabac; } @@ -479,7 +479,7 @@ vector2d_t mv_best = { 0, 0 }; kvz_mvd_cost_func *calc_mvd = calc_mvd_cost; - if (state->encoder_control->cfg->mv_rdo) { + if (state->encoder_control->cfg.mv_rdo) { calc_mvd = kvz_calc_mvd_cost_cabac; } @@ -548,7 +548,7 @@ int wpp_limit = get_wpp_limit(state, orig); kvz_mvd_cost_func 
*calc_mvd = calc_mvd_cost; - if (state->encoder_control->cfg->mv_rdo) { + if (state->encoder_control->cfg.mv_rdo) { calc_mvd = kvz_calc_mvd_cost_cabac; } @@ -585,7 +585,7 @@ pic, ref, mv_cand, ref_idx, best_cost, &best_index, &best_bitcost, calc_mvd); // Check if we should stop search - if (state->encoder_control->cfg->me_early_termination){ + if (state->encoder_control->cfg.me_early_termination){ if (early_terminate(num_cand, merge_cand, mv_in_out, &mv, state, orig, width, height, wpp_limit, pic, ref, mv_cand, ref_idx, &best_cost, bitcost_out, &best_bitcost, calc_mvd)) return best_cost; } @@ -700,7 +700,7 @@ int wpp_limit = get_wpp_limit(state, orig); kvz_mvd_cost_func *calc_mvd = calc_mvd_cost; - if (state->encoder_control->cfg->mv_rdo) { + if (state->encoder_control->cfg.mv_rdo) { calc_mvd = kvz_calc_mvd_cost_cabac; } @@ -738,7 +738,7 @@ pic, ref, mv_cand, ref_idx, best_cost, &best_index, &best_bitcost, calc_mvd); // Check if we should stop search - if (state->encoder_control->cfg->me_early_termination){ + if (state->encoder_control->cfg.me_early_termination){ if (early_terminate(num_cand, merge_cand, mv_in_out, &mv, state, orig, width, height, wpp_limit, pic, ref, mv_cand, ref_idx, &best_cost, bitcost_out, &best_bitcost, calc_mvd)) return best_cost; } @@ -855,7 +855,7 @@ int wpp_limit = get_wpp_limit(state, orig); kvz_mvd_cost_func *calc_mvd = calc_mvd_cost; - if (state->encoder_control->cfg->mv_rdo) { + if (state->encoder_control->cfg.mv_rdo) { calc_mvd = kvz_calc_mvd_cost_cabac; } @@ -1043,10 +1043,10 @@ hpel_pos[6] = fracpel_blocks[HPEL_POS_DIA] + (LCU_WIDTH + 1); hpel_pos[7] = fracpel_blocks[HPEL_POS_DIA] + (LCU_WIDTH + 1) + 1; - int fme_level = state->encoder_control->fme_level; + int fme_level = state->encoder_control->cfg.fme_level; kvz_mvd_cost_func *calc_mvd = calc_mvd_cost; - if (state->encoder_control->cfg->mv_rdo) { + if (state->encoder_control->cfg.mv_rdo) { calc_mvd = kvz_calc_mvd_cost_cabac; } @@ -1059,7 +1059,7 @@ height, fracpel_blocks, 
fme_level); kvz_pixel tmp_pic[LCU_WIDTH*LCU_WIDTH]; - kvz_pixels_blit(pic->y + orig->y*pic->width + orig->x, tmp_pic, width, height, pic->stride, width); + kvz_pixels_blit(pic->y + orig->y * pic->stride + orig->x, tmp_pic, width, height, pic->stride, width); // Search integer position costs[0] = kvz_satd_any_size(width, height, @@ -1229,20 +1229,16 @@ double *inter_cost, uint32_t *inter_bitcost) { - const int x_cu = x >> 3; - const int y_cu = y >> 3; const videoframe_t * const frame = state->tile->frame; kvz_picture *ref_image = state->frame->ref->images[ref_idx]; + const vector2d_t orig = { x, y }; uint32_t temp_bitcost = 0; uint32_t temp_cost = 0; - vector2d_t orig; int32_t merged = 0; uint8_t cu_mv_cand = 0; int8_t merge_idx = 0; int8_t ref_list = state->frame->refmap[ref_idx].list-1; int8_t temp_ref_idx = cur_cu->inter.mv_ref[ref_list]; - orig.x = x_cu * CU_MIN_SIZE_PIXELS; - orig.y = y_cu * CU_MIN_SIZE_PIXELS; // Get MV candidates cur_cu->inter.mv_ref[ref_list] = ref_idx; kvz_inter_get_mv_cand(state, x, y, width, height, mv_cand, cur_cu, lcu, ref_list); @@ -1274,7 +1270,7 @@ } int search_range = 32; - switch (state->encoder_control->cfg->ime_algorithm) { + switch (state->encoder_control->cfg.ime_algorithm) { case KVZ_IME_FULL64: search_range = 64; break; case KVZ_IME_FULL32: search_range = 32; break; case KVZ_IME_FULL16: search_range = 16; break; @@ -1282,7 +1278,7 @@ default: break; } - switch (state->encoder_control->cfg->ime_algorithm) { + switch (state->encoder_control->cfg.ime_algorithm) {
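One fix in search_inter.c changes the source address passed to `kvz_pixels_blit` from `orig->y*pic->width` to `orig->y * pic->stride`. Row offsets in a padded buffer must use the stride, which can be larger than the visible width. A minimal sketch of the addressing rule, using a hypothetical `plane_t` descriptor rather than the real `kvz_picture` struct:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal plane descriptor; not the kvz_picture struct. */
typedef struct {
  const uint8_t *data;
  int width;   /* visible samples per row */
  int stride;  /* allocated samples per row, stride >= width */
} plane_t;

/* Address of sample (x, y): rows are `stride` samples apart, not `width`. */
static const uint8_t *plane_at(const plane_t *p, int x, int y)
{
  return p->data + (size_t)y * (size_t)p->stride + (size_t)x;
}
```

With a width of 6 and a stride of 8, sample (2, 3) lives at offset 26; computing the offset from the width would give 20 and read from the wrong row. When stride and width happen to be equal the two expressions agree, which is how a bug of this kind can go unnoticed.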
kvazaar-1.0.0.tar.gz/src/search_intra.c -> kvazaar-1.1.0.tar.gz/src/search_intra.c
Changed
@@ -102,7 +102,7 @@ int width) { double satd_cost = satd_func(pred, orig_block); - if (TRSKIP_RATIO != 0 && width == 4 && state->encoder_control->trskip_enable) { + if (TRSKIP_RATIO != 0 && width == 4 && state->encoder_control->cfg.trskip_enable) { // If the mode looks better with SAD than SATD it might be a good // candidate for transform skip. How much better SAD has to be is // controlled by TRSKIP_RATIO. @@ -117,7 +117,7 @@ trskip_bits += 2.0 * (CTX_ENTROPY_FBITS(ctx, 1) - CTX_ENTROPY_FBITS(ctx, 0)); } - double sad_cost = TRSKIP_RATIO * sad_func(pred, orig_block) + state->frame->cur_lambda_cost_sqrt * trskip_bits; + double sad_cost = TRSKIP_RATIO * sad_func(pred, orig_block) + state->lambda_sqrt * trskip_bits; if (sad_cost < satd_cost) { return sad_cost; } @@ -145,7 +145,7 @@ costs_out[0] = (double)satd_costs[0]; costs_out[1] = (double)satd_costs[1]; - if (TRSKIP_RATIO != 0 && width == 4 && state->encoder_control->trskip_enable) { + if (TRSKIP_RATIO != 0 && width == 4 && state->encoder_control->cfg.trskip_enable) { // If the mode looks better with SAD than SATD it might be a good // candidate for transform skip. How much better SAD has to be is // controlled by TRSKIP_RATIO. @@ -164,7 +164,7 @@ double sad_costs[PARALLEL_BLKS] = { 0 }; sad_twin_func(preds, orig_block, PARALLEL_BLKS, unsigned_sad_costs); for (int i = 0; i < PARALLEL_BLKS; ++i) { - sad_costs[i] = TRSKIP_RATIO * (double)unsigned_sad_costs[i] + state->frame->cur_lambda_cost_sqrt * trskip_bits; + sad_costs[i] = TRSKIP_RATIO * (double)unsigned_sad_costs[i] + state->lambda_sqrt * trskip_bits; if (sad_costs[i] < (double)satd_costs[i]) { costs_out[i] = sad_costs[i]; } @@ -254,7 +254,7 @@ // max_depth. // - Min transform size hasn't been reached (MAX_PU_DEPTH). 
if (depth < max_depth && depth < MAX_PU_DEPTH) { - split_cost = 3 * state->frame->cur_lambda_cost; + split_cost = 3 * state->lambda; split_cost += search_intra_trdepth(state, x_px, y_px, depth + 1, max_depth, intra_mode, nosplit_cost, pred_cu, lcu); if (split_cost < nosplit_cost) { @@ -296,7 +296,7 @@ } double bits = tr_split_bit + cbf_bits; - split_cost += bits * state->frame->cur_lambda_cost; + split_cost += bits * state->lambda; } else { assert(width <= TR_MAX_WIDTH); } @@ -410,7 +410,7 @@ cost_pixel_nxn_multi_func *satd_dual_func = kvz_pixels_get_satd_dual_func(width); cost_pixel_nxn_multi_func *sad_dual_func = kvz_pixels_get_sad_dual_func(width); - const kvz_config *cfg = state->encoder_control->cfg; + const kvz_config *cfg = &state->encoder_control->cfg; const bool filter_boundary = !(cfg->lossless && cfg->implicit_rdpcm); // Temporary block arrays @@ -430,7 +430,7 @@ // Initial offset decides how many modes are tried before moving on to the // recursive search. int offset; - if (state->encoder_control->full_intra_search) { + if (state->encoder_control->cfg.full_intra_search) { offset = 1; } else { static const int8_t offsets[4] = { 2, 4, 8, 8 }; @@ -529,7 +529,7 @@ // Add prediction mode coding cost as the last thing. We don't want this // affecting the halving search. 
- int lambda_cost = (int)(state->frame->cur_lambda_cost_sqrt + 0.5); + int lambda_cost = (int)(state->lambda_sqrt + 0.5); for (int mode_i = 0; mode_i < modes_selected; ++mode_i) { costs[mode_i] += lambda_cost * kvz_luma_mode_bits(state, modes[mode_i], intra_preds); } @@ -573,7 +573,7 @@ int8_t modes[35], double costs[35], lcu_t *lcu) { - const int tr_depth = CLIP(1, MAX_PU_DEPTH, depth + state->encoder_control->tr_depth_intra); + const int tr_depth = CLIP(1, MAX_PU_DEPTH, depth + state->encoder_control->cfg.tr_depth_intra); const int width = LCU_WIDTH >> depth; kvz_pixel orig_block[LCU_WIDTH * LCU_WIDTH + 1]; @@ -600,7 +600,7 @@ for(int rdo_mode = 0; rdo_mode < modes_to_check; rdo_mode ++) { int rdo_bitcost = kvz_luma_mode_bits(state, modes[rdo_mode], intra_preds); - costs[rdo_mode] = rdo_bitcost * (int)(state->frame->cur_lambda_cost + 0.5); + costs[rdo_mode] = rdo_bitcost * (int)(state->lambda + 0.5); // Perform transform split search and save mode RD cost for the best one. cu_info_t pred_cu; @@ -701,7 +701,7 @@ chroma.cost = kvz_cu_rd_cost_chroma(state, lcu_px.x, lcu_px.y, depth, tr_cu, lcu); double mode_bits = kvz_chroma_mode_bits(state, chroma.mode, intra_mode); - chroma.cost += mode_bits * state->frame->cur_lambda_cost; + chroma.cost += mode_bits * state->lambda; if (chroma.cost < best_chroma.cost) { best_chroma = chroma; @@ -737,7 +737,7 @@ const int8_t modes_in_depth[5] = { 1, 1, 1, 1, 2 }; int num_modes = modes_in_depth[depth]; - if (state->encoder_control->rdo == 3) { + if (state->encoder_control->cfg.rdo == 3) { num_modes = 5; } @@ -819,7 +819,7 @@ kvz_pixel *ref_pixels = &lcu->ref.y[lcu_px.x + lcu_px.y * LCU_WIDTH]; int8_t number_of_modes; - bool skip_rough_search = (depth == 0 || state->encoder_control->rdo >= 3); + bool skip_rough_search = (depth == 0 || state->encoder_control->cfg.rdo >= 3); if (!skip_rough_search) { number_of_modes = search_intra_rough(state, ref_pixels, LCU_WIDTH, @@ -838,11 +838,12 @@ kvz_lcu_set_trdepth(lcu, x_px, y_px, depth, 
depth); // Refine results with slower search or get some results if rough search was skipped. - if (state->encoder_control->rdo >= 2 || skip_rough_search) { + const int32_t rdo_level = state->encoder_control->cfg.rdo; + if (rdo_level >= 2 || skip_rough_search) { int number_of_modes_to_search; - if (state->encoder_control->rdo == 3) { + if (rdo_level == 3) { number_of_modes_to_search = 35; - } else if (state->encoder_control->rdo == 2) { + } else if (rdo_level == 2) { number_of_modes_to_search = (cu_width <= 8) ? 8 : 3; } else { // Check only the predicted modes.
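Several hunks in search_intra.c switch from `state->frame->cur_lambda_cost_sqrt` to `state->lambda_sqrt` while keeping the same cost arithmetic: the lambda is rounded to an integer and multiplied by the mode's bit estimate. A minimal sketch of that arithmetic, assuming the bit estimate is a `double`; `round_lambda` and `add_mode_bit_cost` are hypothetical helper names, not functions from the diff:

```c
#include <assert.h>

/* Round a positive lambda to the nearest integer, mirroring the
 * `(int)(state->lambda_sqrt + 0.5)` expression in the diff. */
static int round_lambda(double lambda_sqrt)
{
  assert(lambda_sqrt >= 0.0);
  return (int)(lambda_sqrt + 0.5);
}

/* Add the cost of signalling a prediction mode to a distortion cost.
 * `mode_bits` stands in for the value returned by a bit estimator such as
 * kvz_luma_mode_bits; its type here is an assumption of this sketch. */
static double add_mode_bit_cost(double cost, double lambda_sqrt, double mode_bits)
{
  return cost + round_lambda(lambda_sqrt) * mode_bits;
}
```

For example, a distortion cost of 100 with a lambda of 4.5 and a 3-bit mode yields 100 + 5 * 3 = 115.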
kvazaar-1.0.0.tar.gz/src/strategies/avx2/picture-avx2.c -> kvazaar-1.1.0.tar.gz/src/strategies/avx2/picture-avx2.c
Changed
@@ -638,6 +638,82 @@ SATD_ANY_SIZE_MULTI_AVX2(quad_avx2, 4) + +static unsigned pixels_calc_ssd_avx2(const kvz_pixel *const ref, const kvz_pixel *const rec, + const int ref_stride, const int rec_stride, + const int width) +{ + __m256i ssd_part; + __m256i diff = _mm256_setzero_si256(); + __m128i sum; + + __m256i ref_epi16; + __m256i rec_epi16; + + __m128i ref_row0, ref_row1, ref_row2, ref_row3; + __m128i rec_row0, rec_row1, rec_row2, rec_row3; + + int ssd; + + switch (width) { + + case 4: + + ref_row0 = _mm_cvtsi32_si128(*(int32_t*)&(ref[0 * ref_stride])); + ref_row1 = _mm_cvtsi32_si128(*(int32_t*)&(ref[1 * ref_stride])); + ref_row2 = _mm_cvtsi32_si128(*(int32_t*)&(ref[2 * ref_stride])); + ref_row3 = _mm_cvtsi32_si128(*(int32_t*)&(ref[3 * ref_stride])); + + ref_row0 = _mm_unpacklo_epi32(ref_row0, ref_row1); + ref_row1 = _mm_unpacklo_epi32(ref_row2, ref_row3); + ref_epi16 = _mm256_cvtepu8_epi16(_mm_unpacklo_epi64(ref_row0, ref_row1) ); + + rec_row0 = _mm_cvtsi32_si128(*(int32_t*)&(rec[0 * rec_stride])); + rec_row1 = _mm_cvtsi32_si128(*(int32_t*)&(rec[1 * rec_stride])); + rec_row2 = _mm_cvtsi32_si128(*(int32_t*)&(rec[2 * rec_stride])); + rec_row3 = _mm_cvtsi32_si128(*(int32_t*)&(rec[3 * rec_stride])); + + rec_row0 = _mm_unpacklo_epi32(rec_row0, rec_row1); + rec_row1 = _mm_unpacklo_epi32(rec_row2, rec_row3); + rec_epi16 = _mm256_cvtepu8_epi16(_mm_unpacklo_epi64(rec_row0, rec_row1) ); + + diff = _mm256_sub_epi16(ref_epi16, rec_epi16); + ssd_part = _mm256_madd_epi16(diff, diff); + + sum = _mm_add_epi32(_mm256_castsi256_si128(ssd_part), _mm256_extracti128_si256(ssd_part, 1)); + sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, _MM_SHUFFLE(1, 0, 3, 2))); + sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, _MM_SHUFFLE(0, 1, 0, 1))); + + ssd = _mm_cvtsi128_si32(sum); + + return ssd >> (2*(KVZ_BIT_DEPTH-8)); + break; + + default: + + ssd_part = _mm256_setzero_si256(); + for (int y = 0; y < width; y += 8) { + for (int x = 0; x < width; x += 8) { + for (int i = 0; i < 8; i += 2) { + 
ref_epi16 = _mm256_cvtepu8_epi16(_mm_unpacklo_epi64(_mm_loadl_epi64((__m128i*)&(ref[x + (y + i) * ref_stride])), _mm_loadl_epi64((__m128i*)&(ref[x + (y + i + 1) * ref_stride])))); + rec_epi16 = _mm256_cvtepu8_epi16(_mm_unpacklo_epi64(_mm_loadl_epi64((__m128i*)&(rec[x + (y + i) * rec_stride])), _mm_loadl_epi64((__m128i*)&(rec[x + (y + i + 1) * rec_stride])))); + diff = _mm256_sub_epi16(ref_epi16, rec_epi16); + ssd_part = _mm256_add_epi32(ssd_part, _mm256_madd_epi16(diff, diff)); + } + } + } + + sum = _mm_add_epi32(_mm256_castsi256_si128(ssd_part), _mm256_extracti128_si256(ssd_part, 1)); + sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, _MM_SHUFFLE(1, 0, 3, 2))); + sum = _mm_add_epi32(sum, _mm_shuffle_epi32(sum, _MM_SHUFFLE(0, 1, 0, 1))); + + ssd = _mm_cvtsi128_si32(sum); + + return ssd >> (2*(KVZ_BIT_DEPTH-8)); + break; + } +} + #endif //COMPILE_INTEL_AVX2 @@ -667,7 +743,9 @@ success &= kvz_strategyselector_register(opaque, "satd_32x32_dual", "avx2", 40, &satd_8bit_32x32_dual_avx2); success &= kvz_strategyselector_register(opaque, "satd_64x64_dual", "avx2", 40, &satd_8bit_64x64_dual_avx2); success &= kvz_strategyselector_register(opaque, "satd_any_size", "avx2", 40, &satd_any_size_8bit_avx2); - success &= kvz_strategyselector_register(opaque, "satd_any_size_quad", "generic", 40, &satd_any_size_quad_avx2); + success &= kvz_strategyselector_register(opaque, "satd_any_size_quad", "avx2", 40, &satd_any_size_quad_avx2); + + success &= kvz_strategyselector_register(opaque, "pixels_calc_ssd", "avx2", 40, &pixels_calc_ssd_avx2); } #endif return success;
kvazaar-1.0.0.tar.gz/src/strategies/avx2/quant-avx2.c -> kvazaar-1.1.0.tar.gz/src/strategies/avx2/quant-avx2.c
Changed
@@ -52,7 +52,7 @@ const uint32_t log2_block_size = kvz_g_convert_to_bit[width] + 2; const uint32_t * const scan = kvz_g_sig_last_scan[scan_idx][log2_block_size - 1]; - int32_t qp_scaled = kvz_get_scaled_qp(type, state->frame->QP, (encoder->bitdepth - 8) * 6); + int32_t qp_scaled = kvz_get_scaled_qp(type, state->qp, (encoder->bitdepth - 8) * 6); const uint32_t log2_tr_size = kvz_g_convert_to_bit[width] + 2; const int32_t scalinglist_type = (block_type == CU_INTRA ? 0 : 3) + (int8_t)("\0\3\1\2"[type]); const int32_t *quant_coeff = encoder->scaling_list.quant_coeff[log2_tr_size - 2][scalinglist_type][qp_scaled % 6]; @@ -104,7 +104,7 @@ temp = _mm_add_epi32(temp, _mm_shuffle_epi32(temp, _MM_SHUFFLE(0, 1, 0, 1))); ac_sum += _mm_cvtsi128_si32(temp); - if (!(encoder->sign_hiding && ac_sum >= 2)) return; + if (!encoder->cfg.signhide_enable || ac_sum < 2) return; int32_t delta_u[LCU_WIDTH*LCU_WIDTH >> 2]; @@ -380,25 +380,23 @@ } // Quantize coeffs. (coeff -> quant_coeff) - if (state->encoder_control->rdoq_enable && (width > 4 || !state->encoder_control->cfg->rdoq_skip)) { + if (state->encoder_control->cfg.rdoq_enable && + (width > 4 || !state->encoder_control->cfg.rdoq_skip)) + { int8_t tr_depth = cur_cu->tr_depth - cur_cu->depth; tr_depth += (cur_cu->part_size == SIZE_NxN ? 1 : 0); kvz_rdoq(state, coeff, quant_coeff, width, width, (color == COLOR_Y ? 0 : 2), scan_order, cur_cu->type, tr_depth); - } - else { + } else { kvz_quant(state, coeff, quant_coeff, width, width, (color == COLOR_Y ? 0 : 2), scan_order, cur_cu->type); } // Check if there are any non-zero coefficients. 
- { - int i; - for (i = 0; i < width * width; i+=8) { - __m128i v_quant_coeff = _mm_loadu_si128((__m128i*)&(quant_coeff[i])); - has_coeffs = !_mm_testz_si128(_mm_set1_epi8(0xFF), v_quant_coeff); - if(has_coeffs) break; - } + for (int i = 0; i < width * width; i += 8) { + __m128i v_quant_coeff = _mm_loadu_si128((__m128i*)&(quant_coeff[i])); + has_coeffs = !_mm_testz_si128(_mm_set1_epi8(0xFF), v_quant_coeff); + if(has_coeffs) break; } // Copy coefficients to coeff_out. @@ -457,7 +455,7 @@ int32_t n; int32_t transform_shift = 15 - encoder->bitdepth - (kvz_g_convert_to_bit[ width ] + 2); - int32_t qp_scaled = kvz_get_scaled_qp(type, state->frame->QP, (encoder->bitdepth-8)*6); + int32_t qp_scaled = kvz_get_scaled_qp(type, state->qp, (encoder->bitdepth-8)*6); shift = 20 - QUANT_SHIFT - transform_shift;
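The sign-hiding hunk rewrites the early-return condition from `!(encoder->sign_hiding && ac_sum >= 2)` to `!encoder->cfg.signhide_enable || ac_sum < 2`. Apart from the renamed field, the two forms should be equivalent by De Morgan's law. A small sketch that checks this over a range of inputs; `skip_old`, `skip_new`, and `check_equivalence` are hypothetical names for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old form of the early-return test. */
static bool skip_old(bool sign_hiding, int32_t ac_sum)
{
  return !(sign_hiding && ac_sum >= 2);
}

/* New form of the early-return test. */
static bool skip_new(bool signhide_enable, int32_t ac_sum)
{
  return !signhide_enable || ac_sum < 2;
}

/* Compare both forms for either flag value and a few sums around the
 * threshold of 2. */
static void check_equivalence(void)
{
  for (int flag = 0; flag <= 1; ++flag) {
    for (int32_t ac_sum = -4; ac_sum <= 4; ++ac_sum) {
      assert(skip_old(flag != 0, ac_sum) == skip_new(flag != 0, ac_sum));
    }
  }
}
```

The rewritten form reads as "return if sign hiding is off, or if there is too little to hide", which is arguably easier to scan than the negated conjunction.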
kvazaar-1.0.0.tar.gz/src/strategies/generic/picture-generic.c -> kvazaar-1.1.0.tar.gz/src/strategies/generic/picture-generic.c
Changed
@@ -518,6 +518,23 @@
 SAD_DUAL_NXN(32, kvz_pixel)
 SAD_DUAL_NXN(64, kvz_pixel)
 
+static unsigned pixels_calc_ssd_generic(const kvz_pixel *const ref, const kvz_pixel *const rec,
+                                        const int ref_stride, const int rec_stride,
+                                        const int width)
+{
+  int ssd = 0;
+  int y, x;
+
+  for (y = 0; y < width; ++y) {
+    for (x = 0; x < width; ++x) {
+      int diff = ref[x + y * ref_stride] - rec[x + y * rec_stride];
+      ssd += diff * diff;
+    }
+  }
+
+  return ssd >> (2*(KVZ_BIT_DEPTH-8));
+}
+
 int kvz_strategy_register_picture_generic(void* opaque, uint8_t bitdepth)
 {
@@ -551,5 +568,7 @@
   success &= kvz_strategyselector_register(opaque, "satd_any_size", "generic", 0, &satd_any_size_generic);
   success &= kvz_strategyselector_register(opaque, "satd_any_size_quad", "generic", 0, &satd_any_size_quad_generic);
 
+  success &= kvz_strategyselector_register(opaque, "pixels_calc_ssd", "generic", 0, &pixels_calc_ssd_generic);
+
   return success;
 }
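The generic SSD routine added here sums squared differences over a square block, with separate strides for the two buffers. A standalone sketch of the same computation, assuming 8-bit samples (so the bit-depth shift is zero); `pixel_t`, `BIT_DEPTH`, and `block_ssd` are stand-ins chosen for this example rather than kvazaar identifiers:

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t pixel_t;   /* stand-in for kvz_pixel, 8-bit build assumed */
#define BIT_DEPTH 8        /* stand-in for KVZ_BIT_DEPTH */

/* Sum of squared differences over a width x width block. Each buffer has
 * its own stride, so the two blocks may sit inside differently sized
 * planes. */
static unsigned block_ssd(const pixel_t *ref, const pixel_t *rec,
                          int ref_stride, int rec_stride, int width)
{
  assert(width > 0 && ref_stride >= width && rec_stride >= width);
  int ssd = 0;
  for (int y = 0; y < width; ++y) {
    for (int x = 0; x < width; ++x) {
      int diff = ref[x + y * ref_stride] - rec[x + y * rec_stride];
      ssd += diff * diff;
    }
  }
  /* Normalise to 8-bit scale; a no-op when BIT_DEPTH is 8. */
  return (unsigned)(ssd >> (2 * (BIT_DEPTH - 8)));
}
```

For a 2x2 block with differences of -1, 2, 0, and 5 the result is 1 + 4 + 0 + 25 = 30, regardless of how much padding either buffer carries.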
kvazaar-1.0.0.tar.gz/src/strategies/generic/quant-generic.c -> kvazaar-1.1.0.tar.gz/src/strategies/generic/quant-generic.c
Changed
@@ -41,7 +41,7 @@ const uint32_t log2_block_size = kvz_g_convert_to_bit[width] + 2; const uint32_t * const scan = kvz_g_sig_last_scan[scan_idx][log2_block_size - 1]; - int32_t qp_scaled = kvz_get_scaled_qp(type, state->frame->QP, (encoder->bitdepth - 8) * 6); + int32_t qp_scaled = kvz_get_scaled_qp(type, state->qp, (encoder->bitdepth - 8) * 6); const uint32_t log2_tr_size = kvz_g_convert_to_bit[width] + 2; const int32_t scalinglist_type = (block_type == CU_INTRA ? 0 : 3) + (int8_t)("\0\3\1\2"[type]); const int32_t *quant_coeff = encoder->scaling_list.quant_coeff[log2_tr_size - 2][scalinglist_type][qp_scaled % 6]; @@ -66,7 +66,7 @@ q_coef[n] = (coeff_t)(CLIP(-32768, 32767, level)); } - if (!(encoder->sign_hiding && ac_sum >= 2)) return; + if (!encoder->cfg.signhide_enable || ac_sum < 2) return; int32_t delta_u[LCU_WIDTH*LCU_WIDTH >> 2]; @@ -213,13 +213,14 @@ } // Quantize coeffs. (coeff -> quant_coeff) - if (state->encoder_control->rdoq_enable && (width > 4 || !state->encoder_control->cfg->rdoq_skip)) { + if (state->encoder_control->cfg.rdoq_enable && + (width > 4 || !state->encoder_control->cfg.rdoq_skip)) + { int8_t tr_depth = cur_cu->tr_depth - cur_cu->depth; tr_depth += (cur_cu->part_size == SIZE_NxN ? 1 : 0); kvz_rdoq(state, coeff, quant_coeff, width, width, (color == COLOR_Y ? 0 : 2), scan_order, cur_cu->type, tr_depth); - } - else { + } else { kvz_quant(state, coeff, quant_coeff, width, width, (color == COLOR_Y ? 0 : 2), scan_order, cur_cu->type); } @@ -286,7 +287,7 @@ int32_t n; int32_t transform_shift = 15 - encoder->bitdepth - (kvz_g_convert_to_bit[ width ] + 2); - int32_t qp_scaled = kvz_get_scaled_qp(type, state->frame->QP, (encoder->bitdepth-8)*6); + int32_t qp_scaled = kvz_get_scaled_qp(type, state->qp, (encoder->bitdepth-8)*6); shift = 20 - QUANT_SHIFT - transform_shift;
kvazaar-1.0.0.tar.gz/src/strategies/strategies-picture.c -> kvazaar-1.1.0.tar.gz/src/strategies/strategies-picture.c
Changed
@@ -59,6 +59,8 @@
 cost_pixel_any_size_func * kvz_satd_any_size = 0;
 cost_pixel_any_size_multi_func * kvz_satd_any_size_quad = 0;
 
+pixels_calc_ssd_func * kvz_pixels_calc_ssd = 0;
+
 int kvz_strategy_register_picture(void* opaque, uint8_t bitdepth) {
   bool success = true;
kvazaar-1.0.0.tar.gz/src/strategies/strategies-picture.h -> kvazaar-1.1.0.tar.gz/src/strategies/strategies-picture.h
Changed
@@ -110,6 +110,7 @@
 typedef void (cost_pixel_nxn_multi_func)(const pred_buffer preds, const kvz_pixel *orig, unsigned num_modes, unsigned *costs_out);
 typedef void (cost_pixel_any_size_multi_func)(int width, int height, const kvz_pixel **preds, const int *strides, const kvz_pixel *orig, const int orig_stride, unsigned num_modes, unsigned *costs_out, int8_t *valid);
 
+typedef unsigned (pixels_calc_ssd_func)(const kvz_pixel *const ref, const kvz_pixel *const rec, const int ref_stride, const int rec_stride, const int width);
 
 // Declare function pointers.
 extern reg_sad_func * kvz_reg_sad;
@@ -141,6 +142,8 @@
 extern cost_pixel_any_size_multi_func *kvz_satd_any_size_quad;
 
+extern pixels_calc_ssd_func *kvz_pixels_calc_ssd;
+
 int kvz_strategy_register_picture(void* opaque, uint8_t bitdepth);
 cost_pixel_nxn_func * kvz_pixels_get_satd_func(unsigned n);
 cost_pixel_nxn_func * kvz_pixels_get_sad_func(unsigned n);
@@ -171,6 +174,7 @@
   {"satd_32x32_dual", (void**) &kvz_satd_32x32_dual}, \
   {"satd_64x64_dual", (void**) &kvz_satd_64x64_dual}, \
   {"satd_any_size_quad", (void**) &kvz_satd_any_size_quad}, \
+  {"pixels_calc_ssd", (void**) &kvz_pixels_calc_ssd}, \
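The new `pixels_calc_ssd` entry is wired up the same way as the existing strategies: each implementation is registered under a type name with a label and a priority (0 for "generic", 40 for "avx2" in this diff), and callers go through a function pointer. The sketch below illustrates that pattern under the assumption that the highest-priority registration for a type wins; `registry_t`, `registry_add`, and `registry_select` are hypothetical and much simpler than kvazaar's actual strategy selector:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (op_func)(int);

/* One registered implementation of an operation. */
typedef struct {
  const char *type;      /* operation name, e.g. "twice" */
  const char *name;      /* implementation label, e.g. "generic" */
  int priority;          /* higher is preferred (assumption of this sketch) */
  op_func *fptr;
} strategy_t;

#define MAX_STRATEGIES 16

typedef struct {
  strategy_t items[MAX_STRATEGIES];
  int count;
} registry_t;

/* Append an implementation; returns 1 on success, 0 if the table is full. */
static int registry_add(registry_t *r, const char *type, const char *name,
                        int priority, op_func *fptr)
{
  assert(r != NULL && fptr != NULL);
  if (r->count >= MAX_STRATEGIES) return 0;
  r->items[r->count++] = (strategy_t){ type, name, priority, fptr };
  return 1;
}

/* Pick the highest-priority implementation of `type`, or NULL if none. */
static const strategy_t *registry_select(const registry_t *r, const char *type)
{
  const strategy_t *best = NULL;
  for (int i = 0; i < r->count; ++i) {
    const strategy_t *s = &r->items[i];
    if (strcmp(s->type, type) != 0) continue;
    if (best == NULL || s->priority > best->priority) best = s;
  }
  return best;
}

/* Two interchangeable implementations of the same operation. */
static int twice_generic(int x) { return x + x; }
static int twice_fast(int x) { return x * 2; }
```

Under this scheme the label matters for bookkeeping: the picture-avx2.c hunk in this revision corrects a registration of `satd_any_size_quad_avx2` that had been labelled "generic".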
kvazaar-1.0.0.tar.gz/src/threadqueue.c -> kvazaar-1.1.0.tar.gz/src/threadqueue.c
Changed
@@ -458,17 +458,8 @@
     notdone = threadqueue->queue_waiting_execution + threadqueue->queue_waiting_dependency + threadqueue->queue_running;
     if (notdone > 0) {
-      int ret;
       PTHREAD_COND_BROADCAST(&(threadqueue->cond));
-
-      struct timespec wait_moment;
-      ms_from_now_timespec(&wait_moment, 100);
-      ret = pthread_cond_timedwait(&threadqueue->cb_cond, &threadqueue->lock, &wait_moment);
-      if (ret != 0 && ret != ETIMEDOUT) {
-        fprintf(stderr, "pthread_cond_timedwait failed!\n");
-        assert(0);
-        return 0;
-      }
+      PTHREAD_COND_WAIT(&threadqueue->cb_cond, &threadqueue->lock);
     }
   } while (notdone > 0);
 
@@ -496,16 +487,8 @@
     PTHREAD_UNLOCK(&job->lock);
 
     if (!job_done) {
-      int ret;
       PTHREAD_COND_BROADCAST(&(threadqueue->cond));
-      struct timespec wait_moment;
-      ms_from_now_timespec(&wait_moment, 100);
-      ret = pthread_cond_timedwait(&threadqueue->cb_cond, &threadqueue->lock, &wait_moment);
-      if (ret != 0 && ret != ETIMEDOUT) {
-        fprintf(stderr, "pthread_cond_timedwait failed!\n");
-        assert(0);
-        return 0;
-      }
+      PTHREAD_COND_WAIT(&threadqueue->cb_cond, &threadqueue->lock);
     }
   } while (!job_done);
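Both hunks in threadqueue.c replace a 100 ms timed wait with a plain condition-variable wait inside the existing `do ... while` loop. The general pattern, waiting on a predicate under a mutex and re-checking it after every wake-up, can be sketched as follows. This is an illustration using plain POSIX threads; `work_tracker_t` and the `tracker_*` functions are hypothetical and do not correspond to kvazaar's threadqueue API:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical tracker of outstanding jobs. */
typedef struct {
  pthread_mutex_t lock;
  pthread_cond_t done_cond;
  int pending;
} work_tracker_t;

static void tracker_init(work_tracker_t *t, int pending)
{
  pthread_mutex_init(&t->lock, NULL);
  pthread_cond_init(&t->done_cond, NULL);
  t->pending = pending;
}

/* Called by a worker when it finishes one job. */
static void tracker_finish_one(work_tracker_t *t)
{
  pthread_mutex_lock(&t->lock);
  assert(t->pending > 0);
  t->pending -= 1;
  if (t->pending == 0) {
    pthread_cond_broadcast(&t->done_cond);
  }
  pthread_mutex_unlock(&t->lock);
}

/* Block until every job is done. The predicate is re-checked after each
 * wake-up, so spurious wake-ups are harmless and no timeout is needed. */
static void tracker_wait_all(work_tracker_t *t)
{
  pthread_mutex_lock(&t->lock);
  while (t->pending > 0) {
    pthread_cond_wait(&t->done_cond, &t->lock);
  }
  pthread_mutex_unlock(&t->lock);
}

static void *worker_main(void *arg)
{
  tracker_finish_one((work_tracker_t *)arg);
  return NULL;
}

/* Start n workers, wait for all of them, and return the remaining count. */
static int run_demo(int n_threads)
{
  pthread_t threads[8];
  work_tracker_t tracker;
  assert(n_threads > 0 && n_threads <= 8);
  tracker_init(&tracker, n_threads);
  for (int i = 0; i < n_threads; ++i) {
    int ret = pthread_create(&threads[i], NULL, worker_main, &tracker);
    assert(ret == 0);
    (void)ret;
  }
  tracker_wait_all(&tracker);
  for (int i = 0; i < n_threads; ++i) {
    pthread_join(threads[i], NULL);
  }
  pthread_mutex_destroy(&tracker.lock);
  pthread_cond_destroy(&tracker.done_cond);
  return tracker.pending;
}
```

An untimed wait only works if every state change that can satisfy the predicate is followed by a signal or broadcast while the waiter can observe it; the sketch guarantees this by changing `pending` and broadcasting under the same lock.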
kvazaar-1.0.0.tar.gz/src/threads.h -> kvazaar-1.1.0.tar.gz/src/threads.h
Changed
@@ -42,41 +42,21 @@
 
 #ifdef __MACH__
 // Workaround Mac OS not having clock_gettime.
-#include <mach/clock.h> // IWYU pragma: export
-#include <mach/mach.h> // IWYU pragma: export
-#define KVZ_GET_TIME(clock_t) { \
-  clock_serv_t cclock; \
-  mach_timespec_t mts; \
-  host_get_clock_service(mach_host_self(), SYSTEM_CLOCK, &cclock); \
-  clock_get_time(cclock, &mts); \
-  mach_port_deallocate(mach_task_self(), cclock); \
-  (clock_t)->tv_sec = mts.tv_sec; \
-  (clock_t)->tv_nsec = mts.tv_nsec; \
-}
+// This needs to work with pthread_cond_timedwait.
+# include <sys/time.h>
+# define KVZ_GET_TIME(clock_t) { \
+  struct timeval tv; \
+  gettimeofday(&tv, NULL); \
+  (clock_t)->tv_sec = tv.tv_sec; \
+  (clock_t)->tv_nsec = tv.tv_usec * 1000; \
+  }
 #else
-#define KVZ_GET_TIME(clock_t) { clock_gettime(CLOCK_MONOTONIC, (clock_t)); }
+# define KVZ_GET_TIME(clock_t) { clock_gettime(CLOCK_MONOTONIC, (clock_t)); }
 #endif
 
 #define KVZ_CLOCK_T_AS_DOUBLE(ts) ((double)((ts).tv_sec) + (double)((ts).tv_nsec) / 1e9)
 #define KVZ_CLOCK_T_DIFF(start, stop) ((double)((stop).tv_sec - (start).tv_sec) + (double)((stop).tv_nsec - (start).tv_nsec) / 1e9)
 
-static INLINE struct timespec * ms_from_now_timespec(struct timespec * result, int wait_ms)
-{
-  KVZ_GET_TIME(result);
-  int64_t secs = result->tv_sec + wait_ms / E3;
-  int64_t nsecs = result->tv_nsec + (wait_ms % E3) * (E9 / E3);
-
-  if (nsecs >= E9) {
-    secs += 1;
-    nsecs -= E9;
-  }
-
-  result->tv_sec = secs;
-  result->tv_nsec = nsecs;
-
-  return result;
-}
-
 #define KVZ_ATOMIC_INC(ptr) __sync_add_and_fetch((volatile int32_t*)ptr, 1)
 #define KVZ_ATOMIC_DEC(ptr) __sync_add_and_fetch((volatile int32_t*)ptr, -1)
 
@@ -91,28 +71,6 @@
 #define KVZ_CLOCK_T_DIFF(start, stop) ((double)((((uint64_t)(stop).dwHighDateTime)<<32 | (uint64_t)(stop).dwLowDateTime) - \
   (((uint64_t)(start).dwHighDateTime)<<32 | (uint64_t)(start).dwLowDateTime)) / 1e7)
 
-static INLINE struct timespec * ms_from_now_timespec(struct timespec * result, int wait_ms)
-{
-  KVZ_CLOCK_T now;
-  KVZ_GET_TIME(&now);
-
-  int64_t moment_100ns = (int64_t)now.dwHighDateTime << 32 | (int64_t)now.dwLowDateTime;
-  moment_100ns -= (int64_t)FILETIME_TO_EPOCH;
-
-  int64_t secs = moment_100ns / (E9 / 100) + (wait_ms / E3);
-  int64_t nsecs = (moment_100ns % (E9 / 100))*100 + ((wait_ms % E3) * (E9 / E3));
-
-  if (nsecs >= E9) {
-    secs += 1;
-    nsecs -= E9;
-  }
-
-  result->tv_sec = secs;
-  result->tv_nsec = nsecs;
-
-  return result;
-}
-
 #define KVZ_ATOMIC_INC(ptr) InterlockedIncrement((volatile LONG*)ptr)
 #define KVZ_ATOMIC_DEC(ptr) InterlockedDecrement((volatile LONG*)ptr)
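The Mac OS branch of `KVZ_GET_TIME` now fills a `timespec` from `gettimeofday`, and the removed `ms_from_now_timespec` helper added a millisecond offset with a carry into the seconds field. The arithmetic behind both can be sketched with two pure functions; `timeval_to_timespec` and `timespec_add_ms` are hypothetical names for this example. As background, a condition variable with default attributes measures `pthread_cond_timedwait` deadlines against the realtime clock, which is presumably what the "needs to work with pthread_cond_timedwait" comment refers to:

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

/* Convert microsecond resolution to nanosecond resolution. */
static struct timespec timeval_to_timespec(struct timeval tv)
{
  struct timespec ts;
  ts.tv_sec = tv.tv_sec;
  ts.tv_nsec = (long)tv.tv_usec * 1000L;
  return ts;
}

/* Return ts advanced by wait_ms milliseconds, keeping tv_nsec below one
 * second by carrying into tv_sec. */
static struct timespec timespec_add_ms(struct timespec ts, int wait_ms)
{
  assert(wait_ms >= 0);
  ts.tv_sec += wait_ms / 1000;
  ts.tv_nsec += (long)(wait_ms % 1000) * 1000000L;
  if (ts.tv_nsec >= 1000000000L) {
    ts.tv_sec += 1;
    ts.tv_nsec -= 1000000000L;
  }
  return ts;
}
```

Starting from 1 s 950000 us and adding 100 ms, the nanosecond field overflows (950000000 + 100000000) and the result is 2 s 50000000 ns.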
kvazaar-1.0.0.tar.gz/src/transform.c -> kvazaar-1.1.0.tar.gz/src/transform.c
Changed
@@ -25,6 +25,7 @@ #include "rdo.h" #include "strategies/strategies-dct.h" #include "strategies/strategies-quant.h" +#include "strategies/strategies-picture.h" #include "tables.h" /** @@ -231,7 +232,7 @@ int has_coeffs; } skip, noskip, *best; - const int bit_cost = (int)(state->frame->cur_lambda_cost+0.5); + const int bit_cost = (int)(state->lambda + 0.5); noskip.has_coeffs = kvz_quantize_residual( state, cur_cu, width, color, scan_order, @@ -341,14 +342,14 @@ cbf_clear(&cur_pu->cbf, depth, COLOR_Y); - if (state->encoder_control->cfg->lossless) { + if (state->encoder_control->cfg.lossless) { if (bypass_transquant(width, LCU_WIDTH, LCU_WIDTH, base_y, recbase_y, recbase_y, orig_coeff_y)) { cbf_set(&cur_pu->cbf, depth, COLOR_Y); } - if (state->encoder_control->cfg->implicit_rdpcm && cur_pu->type == CU_INTRA) { + if (state->encoder_control->cfg.implicit_rdpcm && cur_pu->type == CU_INTRA) { // implicit rdpcm for horizontal and vertical intra modes if (cur_pu->intra.mode == 10) { rdpcm(width, LCU_WIDTH, RDPCM_HOR, orig_coeff_y); @@ -357,7 +358,7 @@ rdpcm(width, LCU_WIDTH, RDPCM_VER, orig_coeff_y); } } - } else if (width == 4 && state->encoder_control->trskip_enable) { + } else if (width == 4 && state->encoder_control->cfg.trskip_enable) { // Try quantization with trskip and use it if it's better. 
int has_coeffs = kvz_quantize_residual_trskip( state, cur_pu, width, COLOR_Y, scan_idx_luma, @@ -438,7 +439,7 @@ scan_idx_chroma = kvz_get_scan_order(cur_cu->type, cur_cu->intra.mode_chroma, depth); - if (state->encoder_control->cfg->lossless) { + if (state->encoder_control->cfg.lossless) { if (bypass_transquant(chroma_width, LCU_WIDTH_C, LCU_WIDTH_C, base_u, recbase_u, @@ -451,7 +452,7 @@ recbase_v, orig_coeff_v)) { cbf_set(&cur_cu->cbf, depth, COLOR_V); } - if (state->encoder_control->cfg->implicit_rdpcm && cur_cu->type == CU_INTRA) { + if (state->encoder_control->cfg.implicit_rdpcm && cur_cu->type == CU_INTRA) { // implicit rdpcm for horizontal and vertical intra modes if (cur_cu->intra.mode_chroma == 10) { rdpcm(chroma_width, LCU_WIDTH_C, RDPCM_HOR, orig_coeff_u);
kvazaar-1.0.0.tar.gz/src/yuv_io.c -> kvazaar-1.1.0.tar.gz/src/yuv_io.c
Changed
@@ -79,12 +79,14 @@ static void shift_to_bitdepth(kvz_pixel* input, int size, int from_bitdepth, int to_bitdepth) { int shift = to_bitdepth - from_bitdepth; + kvz_pixel bitdepth_mask = (1 << from_bitdepth) - 1; + for (int i = 0; i < size; ++i) { // Shifting by a negative number is undefined. if (shift > 0) { - input[i] <<= shift; + input[i] = (input[i] & bitdepth_mask) << shift; } else { - input[i] >>= shift; + input[i] = (input[i] & bitdepth_mask) >> shift; } } } @@ -99,6 +101,7 @@ assert(sizeof(kvz_pixel) > 1); int shift = to_bitdepth - from_bitdepth; unsigned char *byte_buf = (unsigned char *)input; + kvz_pixel bitdepth_mask = (1 << from_bitdepth) - 1; // Starting from the back of the 1-byte samples, copy each sample to it's // place in the 2-byte per sample array, overwriting the bytes that have @@ -109,20 +112,33 @@ for (int i = size - 1; i >= 0; --i) { // Shifting by a negative number is undefined. if (shift > 0) { - input[i] = byte_buf[i] << shift; + input[i] = (byte_buf[i] & bitdepth_mask) << shift; } else { - input[i] = byte_buf[i] >> shift; + input[i] = (byte_buf[i] & bitdepth_mask) >> shift; } } } -bool machine_is_big_endian() +static bool machine_is_big_endian() { + // Big and little endianess refers to which end of the egg you prefer to eat + // first. Therefore in big endian system, the most significant bits are in + // the first address. + uint16_t number = 1; char first_byte = *(char*)&number; - return (first_byte != 0); + return (first_byte == 0); +} + + +static void mask_to_bitdepth(kvz_pixel *buf, unsigned length, unsigned bitdepth) +{ + kvz_pixel bitdepth_mask = (1 << bitdepth) - 1; + for (int i = 0; i < length; ++i) { + buf[i] = buf[i] & bitdepth_mask; + } } @@ -133,8 +149,8 @@ kvz_pixel *out_buf) { unsigned bytes_per_sample = in_bitdepth > 8 ? 
2 : 1; - unsigned buf_length = in_width * in_height; - unsigned buf_bytes = buf_length * bytes_per_sample; + unsigned buf_bytes = in_width * in_height * bytes_per_sample; + unsigned out_length = out_width * out_height; if (in_width == out_width) { // No need to extend pixels. @@ -151,17 +167,23 @@ } if (in_bitdepth > 8) { + // Assume little endian input. if (machine_is_big_endian()) { - swap_16b_buffer_bytes(out_buf, buf_length); + swap_16b_buffer_bytes(out_buf, out_length); } } + // Shift the data to the correct bitdepth. + // Ignore any bits larger than in_bitdepth to guarantee ouput data will be + // in the correct range. if (in_bitdepth <= 8 && out_bitdepth > 8) { - shift_to_bitdepth_and_spread(out_buf, buf_length, in_bitdepth, out_bitdepth); + shift_to_bitdepth_and_spread(out_buf, out_length, in_bitdepth, out_bitdepth); } else if (in_bitdepth != out_bitdepth) { - shift_to_bitdepth(out_buf, buf_length, in_bitdepth, out_bitdepth); + shift_to_bitdepth(out_buf, out_length, in_bitdepth, out_bitdepth); + } else if (in_bitdepth % 8 != 0) { + mask_to_bitdepth(out_buf, out_length, out_bitdepth); } - + return 1; }
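Two ideas in the yuv_io.c changes lend themselves to a small standalone sketch: the corrected endianness test (a big-endian machine stores the most significant byte first, so the first byte of the 16-bit value 1 is zero), and masking a sample to its input bit depth before shifting it to the output bit depth. The sketch below is not kvazaar's code: `convert_sample` is a hypothetical helper, it uses `uint16_t` in place of `kvz_pixel`, and it negates the shift count when narrowing, which is a choice of this example rather than something the diff does:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the most significant byte of a 16-bit value is stored first. */
static bool machine_is_big_endian(void)
{
  uint16_t number = 1;
  unsigned char first_byte = *(unsigned char *)&number;
  return first_byte == 0;
}

/* Discard bits above from_bitdepth, then rescale to to_bitdepth. */
static uint16_t convert_sample(uint16_t sample, int from_bitdepth, int to_bitdepth)
{
  assert(from_bitdepth >= 1 && from_bitdepth <= 16);
  assert(to_bitdepth >= 1 && to_bitdepth <= 16);

  uint16_t mask = (uint16_t)((1u << from_bitdepth) - 1u);
  uint16_t value = (uint16_t)(sample & mask);
  int shift = to_bitdepth - from_bitdepth;

  if (shift > 0) {
    return (uint16_t)(value << shift);
  }
  /* Shifting by a negative count is undefined, so shift right by -shift. */
  return (uint16_t)(value >> -shift);
}
```

Masking first means stray high bits in the input, for instance in a 10-bit sample stored in 16 bits, cannot push the converted value outside the valid range.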
kvazaar-1.1.0.tar.gz/tools/appveyor-build.sh
Added
@@ -0,0 +1,12 @@
+#!/usr/bin/bash
+set -e
+
+export CC=gcc
+
+./autogen.sh
+./configure \
+  --host=$MINGW_CHOST \
+  --build=$MINGW_CHOST \
+  --target=$MINGW_CHOST \
+  --disable-shared --enable-static
+make
kvazaar-1.1.0.tar.gz/tools/appveyor-install.sh
Added
@@ -0,0 +1,10 @@
+#!/usr/bin/bash
+set -e
+
+# Install build dependencies for kvazaar
+pacman -S --noconfirm --noprogressbar --needed \
+  $MINGW_PACKAGE_PREFIX-gcc \
+  $MINGW_PACKAGE_PREFIX-yasm
+
+# Delete unused packages to reduce space used in the Appveyor cache
+pacman -Sc --noconfirm
kvazaar-1.0.0.tar.gz/tools/genmanpage.sh -> kvazaar-1.1.0.tar.gz/tools/genmanpage.sh
Changed
@@ -21,15 +21,13 @@ ../src/kvazaar --help 2>&1 | tail -n+5 | head -n-4 | \ sed 's| : |\n|g; s| :$||g; - s|^ --|.TP\n\\fB--|g; s|^ --|.TP\n\\fB--|g; - s|^ -|.TP\n\\fB-|g; - s|^ ||g; - s|^ ||g; + s|^ -|.TP\n\\fB-|g; + s|^ ||g; s|-|\\-|g; s|, \\-\\-|\\fR, \\fB\\-\\-|g;' \ >> $manpage_file -for s in Slices Wpp Tiles "Parallel processing" "Video Usability Information"; do - sed -i "s|^ ${s}:|.SS \"${s}:\"|g" $manpage_file +for s in Required Presets Input Options "Video structure" "Compression tools" "Parallel processing" "Video Usability Information"; do + sed -i "s|^${s}:|.SS \"${s}:\"|g" $manpage_file done